JDBC – We don’t need no
 stinking JDBC. How
 LinkedIn uses memcached,
 a spoonful of SOA, and a
 sprinkle of SQL to scale.
David Raccah & Dhananjay Ragade
LinkedIn Corporation
Goal of this Presentation
What you will learn
How LinkedIn built a cheap
 and scalable system to
 store our members’ profiles,
 and how you can do the
 same




Agenda

>   Review system ilities
>   What happened to databases?
>   SOA What
>   Discuss existing Best Practices
>   Pixie Dust and Kool-Aid are not so bad
>   What LinkedIn’s got up their sleeve
>   How it all came together…
>   Q&A


Terminology of the ilities
the terms of large successful systems
>   Performance
       Not an “ility” but without it, no ility will save you
>   Availability
       Availability is the proportion of time a system is in a
        functioning condition
>   Reliability
       The probability that a functional unit will perform its
        required function for a specified interval under
        stated conditions.
       The ability of something to "fail well" (fail without
        catastrophic consequences)


Terminology of the ilities
the terms of large successful systems
>   Scalability
       How the system behaves with multiple users vs. a
        single user: if it slows down as users are added,
        it does not scale
>   Manageability
       The ability to manage all parts of a large moving
        system
>   Serviceability
       The ability to service an arm of the system without
        bleeding to death (e.g. change out a database from
        a working system). Bleeding is OK in a high
        performance system – death is NOT acceptable.




Databases
The systems that drive the enterprise … or….
>   RDBMS – Relational Database
    Management System
>   KVSS – Key Value Storage
    System
>   Enterprise Search Engines




Database Server History….




Database mind set has changed…
From data access to data management to….
>   Initially it was all about remote
    data access with an index
>   Then it moved to ACID data
    management and tooling
>   Then it became an Application
    Server with data affinity
>   Now we have come full circle
    and people have figured out
    that scaling is more important
    than relationships, transactions,
    and data and behavioral affinity.


Database Mantra that Rules the Roost
ACID
>   Atomicity – All or nothing
>   Consistency – Data in the
    system should never get in a
    contradictory state.
>   Isolation – Two requests
    cannot interfere with one
    another.
>   Durability – No do over: once
    a transaction commits, its data
    persists, even across failures.
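
The all-or-nothing and consistency rules above can be sketched in a few lines. This is only an illustration, assuming SQLite through Python's built-in sqlite3 module rather than any database named in this talk; the account table and the transfer helper are hypothetical.

```python
import sqlite3

# In-memory database with a consistency rule: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

An overdraft attempt leaves both rows untouched, even though the credit to the destination account ran before the failing debit.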

Anti-Database Rules
BASE
>   Basically Available
       Support partial failures within your
        architecture (e.g. sharding)
>   Soft state
       State may be out of sync for
        some time
>   Eventually consistent
       Eventually all data is made
        consistent (as long as the
        hardware is reliable)


Database Scalability
Or lack thereof…
>   Databases work. Look at:
       Hotmail
       Facebook
       eBay
>   Databases scale with hardware
>   They do not scale horizontally
    well
       Partition management is
        nonexistent, and roll-your-own
        (RYO) partitioning is a mess
       Many use them as ISAM and
        not even relational
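
A rough sketch of what roll-your-own partitioning tends to look like, and why it gets messy. The shard names and both helpers are hypothetical, and plain hash-modulo routing is an assumption chosen for illustration, not a description of any site listed above.

```python
import hashlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical shard names


def shard_for(member_id, shards=SHARDS):
    """Map a member ID to a shard with a stable hash (same answer on every host)."""
    digest = hashlib.sha256(str(member_id).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def moved_fraction(member_ids, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard list changes."""
    moved = sum(
        shard_for(m, old_shards) != shard_for(m, new_shards) for m in member_ids
    )
    return moved / len(member_ids)
```

With modulo routing, growing from four shards to five relocates about 80% of the keys in expectation, which is the kind of data movement the database gives no help with.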

Database Tools and language
Duh…
>   De facto standards for tools and
    languages abound for relational
    databases
>   Easy to manage the data within
    a partition and easy to write
    code to operate on said data
>   Terrifying but nice to use
    extensions include running
    Java within the Data Engine, so
    that you could run your
    application within the big iron


Database’s other features
Which are the pain points….
> Constraints – Nice idea until
  you start partitioning.
  2PC is the anti-scalability
  pattern (Pat Helland)
> Computation – this feature turns out to cause more
  pain as its cost rises with scale, and it is incompatible
  with most languages and tools.
> Replication & backup
       Nice tools that are indeed important and useful
>   ACL support & Data Engine optimizations
       Used for sure, but exist to circumvent deficiencies

Key Value Storage Systems
BigTable, Hive, Dynamo – the Wild Wild West
>   Reliable – Proven on web
>   Available – redundant (locally)
>   Scalable – no constraints
>   Limited ACIDity
>   No Standard and not portable
>   Almost no:
       Constraints or relationships
       Computation or transactions


Enterprise Search Engines
Index yes – storage device no
>   A great inverted index
>   Finds data quickly
>   However, what it returns is
    commonly an ID to the
    entity(s) in question
>   Real-Time solutions are
    available but not fully
    deployed today
>   Limited ACIDity/transactions
>   Scalable, available, reliable
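
To make "index yes, storage device no" concrete, here is a toy inverted index. It is a sketch under simple assumptions (whitespace tokenization, AND semantics), and build_index and search are hypothetical helpers, not any search engine's API. Note that a search returns only IDs; the entities themselves still have to be fetched from the data store.

```python
from collections import defaultdict


def build_index(docs):
    """Build term -> set of document IDs from {doc_id: text}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, *terms):
    """Return the sorted IDs of documents containing every term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical member headlines keyed by member ID.
DOCS = {
    101: "Java engineer at LinkedIn",
    102: "Sales manager",
    103: "Java architect",
}
INDEX = build_index(DOCS)
```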


SOA
Service Oriented Architecture
>   SOA may be overkill for most
    enterprises
>   Still, a tiered and layered
    architecture – which is what
    SOA hoped to formulate and
    standardize – is a solid approach
>   Services (not SOA) allow for
    efficient reuse of business
    processes and aggregation
    services within a complex
    development organization


Best Practices
Storage and architecture
>   Store critical data redundantly
    and reliably with a cluster
       Google via BigTable, Facebook
        via MySQL, eBay via replicated &
        sharded DB
>   Layer services on top of the
    storage device to manage data
    integrity and complexity
       LinkedIn, Amazon, eBay



Best Practices
Storage and architecture
>   Create a bus to route
    replicated data to consumers –
    e.g. search, data mining, etc.
       Almost all sites
>   Parallelization via things like
    scatter/gather
       Almost all search topologies
        (Google, Yahoo, Live),
       Facebook, etc.
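
Scatter/gather can be sketched as follows: send the same query to every partition in parallel, then merge the partial results. The partitions, the term-count scoring, and the helper names are all hypothetical; a thread pool stands in for remote calls to partition servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Two hypothetical index partitions, each holding (doc_id, text) pairs.
PARTITIONS = [
    [("m1", "java engineer"), ("m2", "sales manager")],
    [("m3", "java architect"), ("m4", "java java guru")],
]


def search_one(partition, term):
    """Search a single partition; return (score, doc_id) for each match."""
    return [
        (text.split().count(term), doc_id)
        for doc_id, text in partition
        if term in text.split()
    ]


def scatter_gather(term, partitions, top_k=3):
    """Scatter the query to all partitions in parallel, then gather and rank."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(lambda p: search_one(p, term), partitions))
    merged = [hit for partial in partials for hit in partial]
    # Highest score first; ties keep partition order because sorted() is stable.
    return sorted(merged, key=lambda hit: hit[0], reverse=True)[:top_k]
```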


Best Practices
Storage and architecture
>   Keep the system stateless
       eBay, Google, etc.
>   Partition data and services
       Facebook, eBay
>   Cache data
>   Replicate your data
>   Route requests to where the
    behavior and/or data exists
>   Degrade gracefully with load

Best Practices
Storage and architecture
>   Tiering systems
       Latency vs. Affinity
           Traversal versus affinity – you need to
            analyze the cost and make a decision
       Scaling vs. parallelizing
           Do you need to keep tiering all
            systems to keep the scalability
            uniform?
       Complexity vs. diminished
        dependencies
           Do the reduced dependencies make
            up for the increased system
            complexity?




Pixie Dust and Kool-Aid
Building on the past




Pixie Dust and Kool-Aid
Building on the past
> So what do we want:
    Reliable
    Available
    Scalable
    ACIDity on simple transactions
    Standard and portable interface
    Data Optimizations
    Cache and replicate
    Low cost BASE architecture




LinkedIn’s Data Services
Mixture of standards and pixie dust
>   Front a database with a service
>   Cache data
>   Route to and partition the data
    service
>   Scale and replicate services in a
    horizontal manner
>   Keep all writes ACID and
    subsequent reads ACID as well
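
The first two bullets (front a database with a service, cache data) might look roughly like this. It is a generic cache-aside sketch, not LinkedIn's code: plain dicts stand in for the database and for memcached, ProfileService is a hypothetical name, and invalidating the cache entry on write is an assumption made to keep the example short.

```python
class ProfileService:
    """A service that fronts the database; callers never touch storage directly."""

    def __init__(self, db, cache):
        self.db = db          # dict standing in for the ACID database
        self.cache = cache    # dict standing in for memcached
        self.db_reads = 0     # counts how often a read fell through to the DB

    def get(self, member_id):
        profile = self.cache.get(member_id)
        if profile is None:               # cache miss: fall back to the database
            self.db_reads += 1
            profile = self.db[member_id]
            self.cache[member_id] = profile
        return profile

    def update(self, member_id, profile):
        self.db[member_id] = profile      # every write goes to the database
        self.cache.pop(member_id, None)   # drop the stale cached copy
```

Repeated reads of the same profile hit the database once, and a read after an update sees the new value because the stale entry was dropped.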



LinkedIn’s Data Services
Mixture of standards and pixie dust
>   Databases are reliable
>   Scale out at the service
>   Replicate and cache
>   Partitioning comes from the front
    tier and business servers that
    front the data services




LinkedIn’s Data Services
Immediate replication vs. eventual replication
>   Caching needs a consistency algorithm
>   Techniques for immediate replication
       Paxos
           Chubby, Microsoft Autopilot, ZooKeeper
       N-Phase Commit (2PC and 3PC)
>   Techniques for eventual consistency
       BASE (Basically Available, Soft state,
        Eventually consistent)
           Inktomi, Dynamo, AWS



LinkedIn’s Data Services
LinkedIn’s approach
>   Keep core data ACID
>   Keep replicated and cached data BASE
>   Replicate data via the data bus
>   Cache data in cheap memory
    (memcached)
>   Use a hint to route the client to his or
    her ACID data




LinkedIn’s Data Services
Databus – the linchpin of our replication




LinkedIn’s Data Services
LinkedIn’s approach




LinkedIn’s Data Services
Core DS
>   Keep core data ACID in the DB
>   All writes come here.
>   Databus source for all replication
>   The last line of defense for a
    cache miss
>   Manages sharding




LinkedIn’s Data Services
RepDS
>   Manages cache consistency and
    replication
>   Manages the freshness of the
    caller
>   Reads come from cache




LinkedIn’s Data Services
RepReader
>   RepReader is the typical tip of the
    iceberg problem
>   All read operations are sourced
    from the cache unless the caller’s
    freshness token is out of the window




LinkedIn’s Data Services
Freshness Token (AKA Pixie Dust)
>   The freshness token = Pixie Dust for
    CUD operations
>   It also allows us to give the caller
    control over whether they are content
    with BASE data, even if they did no
    CUD operation.
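
One way the freshness token could work is sketched below. The slides do not say what the token contains, so this assumes it is a version number handed back by the core data service on each write; CoreDS and RepDS here are toy stand-ins that borrow the slide names, and replicate() simulates a Databus delivery.

```python
class CoreDS:
    """Source of truth: all writes land here (stands in for the ACID database)."""

    def __init__(self):
        self.rows = {}      # key -> (version, value)
        self.version = 0

    def write(self, key, value):
        self.version += 1
        self.rows[key] = (self.version, value)
        return self.version  # assumed freshness token: version of this write

    def read(self, key):
        return self.rows[key]


class RepDS:
    """Replicated, cached copy: BASE data that may lag behind the core."""

    def __init__(self, core):
        self.core = core
        self.cache = {}     # key -> (version, value)

    def replicate(self, key):
        """Simulate the data bus delivering the latest row to the cache."""
        self.cache[key] = self.core.read(key)

    def read(self, key, token=None):
        """Serve from cache unless it is older than the caller's token."""
        cached = self.cache.get(key)
        if cached is not None and (token is None or cached[0] >= token):
            return cached[1], "cache"
        return self.core.read(key)[1], "core"
```

A caller that passes no token accepts BASE data and is always served from the cache once it is populated. A caller that has just written passes its token and is routed to the core until replication catches up.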




LinkedIn’s Data Services
For the love of Pixie dust and Kool-Aid
>   We use commodity hardware and
    software to run our service
>   We use Pixie Dust to keep costs down
    and keep our customers happy
>   We keep OPS and the exec-staff
    happy with our special brand of
    Kool-Aid






Profile Re-architecture
Changing planes in mid-flight

>   Original LinkedIn System
>   Use of XML for i18n
>   Phased Transition




Problems from the original system
Anthropology 101
>   Be fair… it worked well for
    a startup
>   Many tables in one big
    DB
>   Too many similar object
    hierarchies
>   No well defined domains




Why XML?
Flexibility
>   Profile has many fields
>   1NF for i18n ==> too many
    tables
>   StAX for fast parsing
>   Easier to version the profile
>   Human readable
>   JSON? ProtoBuf?
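
As an illustration of the points above, a profile with localized fields can live in one XML document instead of one table per language, and a streaming parser can pull out a single locale. The schema below is invented for this sketch, and Python's ElementTree.iterparse stands in for StAX, which is a Java API.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical profile document; the real LinkedIn schema is not shown in the talk.
PROFILE_XML = """<profile version="2">
  <headline lang="en">Software Engineer</headline>
  <headline lang="de">Softwareentwickler</headline>
  <summary lang="en">Builds data services.</summary>
</profile>"""


def localized_fields(xml_text, lang):
    """Stream through the document and collect the fields for one language."""
    fields = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.get("lang") == lang:
            fields[elem.tag] = elem.text
        elem.clear()  # free each element once it has been handled
    return fields
```

Adding a language means adding elements, not tables, and the version attribute gives readers a hook for handling older documents.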




Issues with XML
<good/> <bad/> <ugly/>
>   XML schema design tradeoffs
    and analytics impact
>   XML is verbose
>   StAX is unfriendly
>   XML in the DB caused us
    some performance
    headaches




Phased Transition
Evolving a living, breathing organism
>   Successive iterations avoid breakages
>   No major site downtime
>   Easier to sanity check
>   Does not hold other teams hostage
>   Phases LinkedIn went through




Double Writes Topology
Safety first
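
The diagram for this slide did not survive extraction, so the following is only a generic sketch of the double-write pattern, not the topology LinkedIn used. It assumes the legacy store stays the source of truth while the new store receives shadow writes; DoubleWriter and its dict-backed stores are hypothetical.

```python
class DoubleWriter:
    """Write to both stores; read from legacy and record any disagreement."""

    def __init__(self, legacy, new):
        self.legacy = legacy      # dict standing in for the legacy tables
        self.new = new            # dict standing in for the new profile store
        self.mismatches = []      # keys where the two stores disagreed on read

    def write(self, key, value):
        self.legacy[key] = value  # legacy stays the source of truth
        self.new[key] = value     # shadow write keeps the new store in step

    def read(self, key):
        value = self.legacy.get(key)
        if self.new.get(key) != value:
            self.mismatches.append(key)  # sanity check before cutting over
        return value
```

Once the mismatch list stays empty for long enough, reads can move to the new store and the legacy tables can be dropped.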




After Legacy Tables Dropped
Auld Lang Syne




Wrap up
The moral of the story is…
>   Keep your system BASE
>   Use commodity hardware
>   Use pixie dust (AKA data freshness token)
>   Evolve slowly - no big bang!




Q&A




David Raccah & Dhananjay Ragade
draccah@linkedin.com
dragade@linkedin.com
LinkedIn Corporation




Appendix
Performance
Often mixed up with scalability
>   Performance
       A numerical value given to
        a single system when
        asked to do a task under
        nominal load
       If the system responds
        poorly without load, it will
        assuredly continue its
        molasses response time
        under load




Availability
Often mixed up with reliability
>   Availability
       A numerical value
        given to a system
        that defines the
        proportion of time a
        system is in a
        functioning condition.
       Most common
        scoring system is
        called nines – which is defined as uptime divided by
        total time (uptime plus downtime) – five nines = 0.99999
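
The nines arithmetic can be worked through directly. This sketch assumes a 365-day year and takes "N nines" to mean an unavailability of 10^-N; both helper names are made up for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(uptime, downtime):
    """Availability = uptime / (uptime + downtime), in any consistent unit."""
    return uptime / (uptime + downtime)


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability of N nines."""
    unavailability = 10 ** -nines      # five nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability
```

Five nines leaves a budget of roughly 5.3 minutes of downtime per year, while three nines allows close to 8.8 hours.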


Reliability
The ability for a system to perform its functionality
>   Reliability
       A system can be 100% available
        and still be 100% unreliable (e.g.
        inconsistent caching)
       A person can consistently give
        you the wrong answer
       Architecture is defined as the
        balance of the ilities and cost




Scalability
the term that many think is the holy grail
>   Scalability
       The ability for a system to manage
        more traffic or to be “scaled” as
        more traffic appears
       System slows with multiple users
        vs. single user
       Route, partition, orchestrate,
        replicate, and go async
       Split the system horizontally
       Rarely scale vertically




The rest of the ilities
the ones that people tend to ignore until it’s too late
>   Manageability
       It is a double-edged
        sword which can be
        easily ignored
>   Serviceability
       Here complexity starts to
        rear its ugly head
>   Maintainability
       Of course maintainability
        tends to run upstream of
        complexity



More Related Content

What's hot

Netezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture AdministrationNetezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture Administration
Braja Krishna Das
 
Microsoft Fabric: How to Accelerate AI with Data
Microsoft Fabric: How to Accelerate AI with DataMicrosoft Fabric: How to Accelerate AI with Data
Microsoft Fabric: How to Accelerate AI with Data
BrainSell Technologies
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
Luke Han
 
Disaster Recovery Cook Book
Disaster Recovery Cook BookDisaster Recovery Cook Book
Disaster Recovery Cook Book
PT Datacomm Diangraha
 
Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix API
Daniel Jacobson
 
dbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchezdbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchez
GoDataDriven
 
Databases in the Cloud
Databases in the CloudDatabases in the Cloud
Databases in the Cloud
Amazon Web Services
 
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
DataWorks Summit
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
Databricks
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
KafkaZone
 
What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1
Satishbabu Gunukula
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Databricks
 
The Basics of Automotive Ethernet Webinar Slidedeck
The Basics of Automotive Ethernet Webinar SlidedeckThe Basics of Automotive Ethernet Webinar Slidedeck
The Basics of Automotive Ethernet Webinar Slidedeck
teledynelecroy
 
Oracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service ConferenceOracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service Conference
Okcan Yasin Saygılı
 
Azure: PaaS or IaaS
Azure: PaaS or IaaSAzure: PaaS or IaaS
Azure: PaaS or IaaS
Shahed Chowdhuri
 
Azure Storage
Azure StorageAzure Storage
Azure Storage
Mustafa
 
The Power of HPC with Next Generation Supermicro Systems
The Power of HPC with Next Generation Supermicro Systems The Power of HPC with Next Generation Supermicro Systems
The Power of HPC with Next Generation Supermicro Systems
Rebekah Rodriguez
 
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStorDelivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
Rebekah Rodriguez
 
De-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
De-Risk Your Digital Transformation — And Reduce Time, Cost & ComplexityDe-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
De-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
Inductive Automation
 
Zero data Loss Recovery Appliance
Zero data Loss Recovery ApplianceZero data Loss Recovery Appliance
Zero data Loss Recovery Appliance
Trivadis
 

What's hot (20)

Netezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture AdministrationNetezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture Administration
 
Microsoft Fabric: How to Accelerate AI with Data
Microsoft Fabric: How to Accelerate AI with DataMicrosoft Fabric: How to Accelerate AI with Data
Microsoft Fabric: How to Accelerate AI with Data
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
 
Disaster Recovery Cook Book
Disaster Recovery Cook BookDisaster Recovery Cook Book
Disaster Recovery Cook Book
 
Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix API
 
dbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchezdbt Python models - GoDataFest by Guillermo Sanchez
dbt Python models - GoDataFest by Guillermo Sanchez
 
Databases in the Cloud
Databases in the CloudDatabases in the Cloud
Databases in the Cloud
 
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 
The Basics of Automotive Ethernet Webinar Slidedeck
The Basics of Automotive Ethernet Webinar SlidedeckThe Basics of Automotive Ethernet Webinar Slidedeck
The Basics of Automotive Ethernet Webinar Slidedeck
 
Oracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service ConferenceOracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service Conference
 
Azure: PaaS or IaaS
Azure: PaaS or IaaSAzure: PaaS or IaaS
Azure: PaaS or IaaS
 
Azure Storage
Azure StorageAzure Storage
Azure Storage
 
The Power of HPC with Next Generation Supermicro Systems
The Power of HPC with Next Generation Supermicro Systems The Power of HPC with Next Generation Supermicro Systems
The Power of HPC with Next Generation Supermicro Systems
 
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStorDelivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
Delivering Supermicro Software Defined Storage Solutions with OSNexus QuantaStor
 
De-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
De-Risk Your Digital Transformation — And Reduce Time, Cost & ComplexityDe-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
De-Risk Your Digital Transformation — And Reduce Time, Cost & Complexity
 
Zero data Loss Recovery Appliance
Zero data Loss Recovery ApplianceZero data Loss Recovery Appliance
Zero data Loss Recovery Appliance
 

Viewers also liked

LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
LinkedIn - A Professional Network built with Java Technologies and Agile Prac...LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
LinkedIn
 
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips. Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
Hakka Labs
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud Computing
Ahmed Banafa
 
LinkedIn for B2B
LinkedIn for B2BLinkedIn for B2B
LinkedIn for B2B
Jared Degnan
 
Developing a LinkedIn Recruiting Strategy
Developing a LinkedIn Recruiting StrategyDeveloping a LinkedIn Recruiting Strategy
Developing a LinkedIn Recruiting Strategy
aparton
 
Social recruiting seminar maximizing linked in and facebook for recruiting ...
Social recruiting seminar   maximizing linked in and facebook for recruiting ...Social recruiting seminar   maximizing linked in and facebook for recruiting ...
Social recruiting seminar maximizing linked in and facebook for recruiting ...
HireClix
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data Products
Peter Skomoroch
 
Social Media Data Mining
Social Media Data MiningSocial Media Data Mining
Social Media Data Mining
Teresa Rothaar
 
Strata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of MicroworkStrata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of Microwork
Daniel Tunkelang
 
Linkedin Analytics Week 11 MKT 9715 baruch mba program Prof Marshall Sponder
Linkedin Analytics Week 11 MKT 9715 baruch mba program Prof Marshall SponderLinkedin Analytics Week 11 MKT 9715 baruch mba program Prof Marshall Sponder
Linkedin Analytics Week 11 MKT 9715 baruch mba program Prof Marshall Sponder
Marshall Sponder
 
تقرير عن شبكات التواصل الاجتماعية
تقرير عن شبكات التواصل الاجتماعيةتقرير عن شبكات التواصل الاجتماعية
تقرير عن شبكات التواصل الاجتماعية
ssaa1430
 
الملخص الرقمي - الربع الثالث 2014
الملخص الرقمي - الربع الثالث 2014الملخص الرقمي - الربع الثالث 2014
الملخص الرقمي - الربع الثالث 2014
MOTC Qatar
 
Business Applications of Predictive Modeling at Scale - KDD 2016 Tutorial
Business Applications of Predictive Modeling at Scale - KDD 2016 TutorialBusiness Applications of Predictive Modeling at Scale - KDD 2016 Tutorial
Business Applications of Predictive Modeling at Scale - KDD 2016 Tutorial
Qiang Zhu
 
تحليل المعلومات في الشبكات الإجتماعية
تحليل المعلومات في الشبكات الإجتماعيةتحليل المعلومات في الشبكات الإجتماعية
تحليل المعلومات في الشبكات الإجتماعية
Riyadh Geeks
 
LinkedIn Demographics & Statistics - Jan 2012
LinkedIn Demographics & Statistics - Jan 2012LinkedIn Demographics & Statistics - Jan 2012
LinkedIn Demographics & Statistics - Jan 2012
Amodiovalerio Verde
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
Datamining Tools
 
LinkedIn Strategies for Recruiting: A Case Study
LinkedIn Strategies for Recruiting: A Case StudyLinkedIn Strategies for Recruiting: A Case Study
LinkedIn Strategies for Recruiting: A Case Study
Kara Yarnot
 
The Play Framework at LinkedIn
The Play Framework at LinkedInThe Play Framework at LinkedIn
The Play Framework at LinkedIn
Yevgeniy Brikman
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
DataminingTools Inc
 
Q2 2014 LinkedIn Economic Confidence Outlook
Q2 2014 LinkedIn Economic Confidence OutlookQ2 2014 LinkedIn Economic Confidence Outlook
Q2 2014 LinkedIn Economic Confidence Outlook
LinkedIn
 

Viewers also liked (20)

LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
LinkedIn - A Professional Network built with Java Technologies and Agile Prac...LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
LinkedIn - A Professional Network built with Java Technologies and Agile Prac...
 
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips. Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
Crowdsourcing Series: LinkedIn. By Vitaly Gordon & Patrick Philips.
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud Computing
 
LinkedIn for B2B
LinkedIn for B2BLinkedIn for B2B
LinkedIn for B2B
 
Developing a LinkedIn Recruiting Strategy
Developing a LinkedIn Recruiting StrategyDeveloping a LinkedIn Recruiting Strategy
Developing a LinkedIn Recruiting Strategy
 
Social recruiting seminar maximizing linked in and facebook for recruiting ...
Social recruiting seminar   maximizing linked in and facebook for recruiting ...Social recruiting seminar   maximizing linked in and facebook for recruiting ...
Social recruiting seminar maximizing linked in and facebook for recruiting ...
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data Products
 
Social Media Data Mining
Social Media Data MiningSocial Media Data Mining
How LinkedIn uses memcached, a spoonful of SOA, and a sprinkle of SQL to scale

  • 1. JDBC – We don’t need no stinking JDBC. How LinkedIn uses memcached, a spoonful of SOA, and a sprinkle of SQL to scale.
       David Raccah & Dhananjay Ragade
       LinkedIn Corporation
  • 2. Goal of this Presentation
       What you will learn
       How LinkedIn built a cheap and scalable system to store our members’ profiles, and how you can do the same
  • 3. Agenda
       > Review system ilities
       > What happened to databases?
       > SOA What
       > Discuss existing Best Practices
       > Pixie Dust and Kool-Aid are not so bad
       > What LinkedIn’s got up their sleeve
       > How it all came together…
       > Q&A
  • 4. Terminology of the ilities
       the terms of large successful systems
       > Performance
           - Not an “ility”, but without it no ility will save you
       > Availability
           - The proportion of time a system is in a functioning condition
       > Reliability
           - The probability that a functional unit will perform its required function for a specified interval under stated conditions
           - The ability of something to "fail well" (fail without catastrophic consequences)
  • 5. Terminology of the ilities
       the terms of large successful systems
       > Scalability
           - The system is slow with multiple users, as opposed to slow with a single user (which is a performance problem)
       > Manageability
           - The ability to manage all parts of a large moving system
       > Serviceability
           - The ability to service an arm of the system without bleeding to death (e.g. change out a database from a working system). Bleeding is OK in a high-performance system; death is NOT acceptable.
  • 7. Databases
       The systems that drive the enterprise … or …
       > RDBMS: Relational Database Management System
       > KVSS: Key Value Storage System
       > Enterprise Search Engines
  • 9. Database mindset has changed…
       From data access to data management to…
       > Initially it was all about remote data access with an index
       > Then it moved to ACID data management and tooling
       > Then it became an application server with data affinity
       > Now we have come full circle: people have figured out that scaling is more important than relationships, transactions, and data and behavioral affinity
  • 10. Database Mantra that Rules the Roost
       ACID
       > Atomicity: All or nothing.
       > Consistency: Data in the system should never get into a contradictory state.
       > Isolation: Two requests cannot interfere with one another.
       > Durability: No do-over. Once the data is persisted, the write is not lost or undone.
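The "all or nothing" property can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module rather than anything from the talk; the `account` table, the `transfer` helper, and the balances are assumptions made up for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two rows; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account id: balance} for every row."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

A transfer that would overdraw the source account leaves both rows exactly as they were, because the debit and the credit are rolled back together.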
  • 11. Anti-Database Rules
       BASE
       > Basically Available
           - Support partial failures within your architecture (e.g. sharding)
       > Soft state
           - State may be out of sync for some time
       > Eventually consistent
           - Eventually all data is made consistent (as long as the hardware is reliable)
  • 12. Database Scalability
       Or lack thereof…
       > Databases work. Look at:
           - Hotmail
           - Facebook
           - eBay
       > Databases scale with hardware
       > They do not scale well horizontally
           - Partition management is nonexistent, and roll-your-own (RYO) is a mess
           - Many use them as ISAM and not even as relational stores
  • 13. Database Tools and Language
       Duh…
       > De facto standards for tools and languages abound for relational databases
       > Easy to manage the data within a partition and easy to write code to operate on said data
       > Terrifying but nice-to-use extensions include running Java within the data engine, so that you can run your application within the big iron
  • 14. Database’s other features
       Which are the pain points…
       > Constraints: Nice idea until you start partitioning. 2PC is the anti-scalability pattern (Pat Helland)
       > Computation: This feature turns out to cause more pain, as its cost rises with scale and it is incompatible with most languages and tools
       > Replication & backup
           - Nice tools that are indeed important and useful
       > ACL support & data engine optimizations
           - Used for sure, but exist to circumvent deficiencies
  • 15. Key Value Storage Systems
       BigTable, Hive, Dynamo: the Wild Wild West
       > Reliable: proven on the web
       > Available: redundant (locally)
       > Scalable: no constraints
       > Limited ACIDity
       > No standard and not portable
       > Almost no:
           - Constraints or relationships
           - Computation or transactions
  • 16. Enterprise Search Engines
       Index yes, storage device no
       > A great inverted index
       > Finds data quickly
       > However, what it returns is commonly an ID to the entity(s) in question
       > Real-time solutions are available but not fully deployed today
       > Limited ACIDity/transactions
       > Scalable, available, reliable
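The "index yes, storage device no" point can be sketched in a few lines: an inverted index maps terms to entity IDs, and the caller still has to fetch the entities from wherever they are stored. The `PROFILES` data and both function names below are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical profile store; the search index never holds these values.
PROFILES = {
    101: "Java engineer memcached",
    102: "Oracle DBA",
    103: "Search engineer Java",
}

def build_index(documents):
    """Build term -> set of entity IDs (the inverted index)."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, term):
    """Return matching entity IDs only; the entities live elsewhere."""
    return sorted(index.get(term.lower(), set()))

INDEX = build_index(PROFILES)
```

A caller would run `search(INDEX, "java")` to get IDs and then look each ID up in the profile store, which is why the search engine alone cannot serve as the system of record.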
  • 18. SOA
       Service Oriented Architecture
       > SOA may be overkill for most enterprises
       > Still, a tiered and layered architecture, which is what SOA hoped to formulate and standardize, is a solid approach
       > Services (not SOA) allow for efficient reuse of business processes and aggregation services within a complex development organization
  • 20. Best Practices
       Storage and architecture
       > Store critical data redundantly and reliably with a cluster
           - Google via BigTable, Facebook via MySQL, eBay via replicated & sharded DB
       > Layer services on top of the storage device to manage data integrity and complexity
           - LinkedIn, Amazon, eBay
  • 21. Best Practices
       Storage and architecture
       > Create a bus to route replicated data to consumers (e.g. search, data mining, etc.)
           - Almost all sites
       > Parallelization via things like scatter/gather
           - Almost all search topologies (Google, Yahoo, Live)
           - Facebook, etc.
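Scatter/gather can be sketched as follows: fan a query out to every partition in parallel, then merge the partial answers. This is a minimal illustration using Python's standard `concurrent.futures`; the `SHARDS` contents and the function names are assumptions, not anything the slides describe.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each holds a slice of the member index.
SHARDS = [
    {1: "java memcached", 2: "oracle dba"},
    {3: "java soa", 4: "search relevance"},
    {5: "memcached java jdbc"},
]

def search_shard(shard, term):
    """Query one partition and return the matching member IDs."""
    return [member_id for member_id, text in shard.items() if term in text.split()]

def scatter_gather(term, shards=SHARDS):
    """Scatter the query to every shard in parallel, then gather and merge."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(search_shard, shards, [term] * len(shards)))
    return sorted(member_id for partial in partials for member_id in partial)
```

In a real topology each shard would be a remote call, so total latency approaches that of the slowest shard rather than the sum of all of them.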
  • 22. Best Practices
       Storage and architecture
       > Keep the system stateless
           - eBay, Google, etc.
       > Partition data and services
           - Facebook, eBay
       > Cache data
       > Replicate your data
       > Route requests to where the behavior and/or data exists
       > Degrade gracefully with load
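Partitioning and routing can be sketched together: a stable hash of the key picks the shard, and every request for that key is routed to the same place. The hash choice (SHA-256), the `shard_for` helper, and the `PartitionedStore` class are assumptions for illustration; the slides do not say how any of the named sites partition their data.

```python
import hashlib

def shard_for(member_id, shard_count):
    """Map a member ID to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to keep routing consistent across servers.
    """
    digest = hashlib.sha256(str(member_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class PartitionedStore:
    """Routes each key to exactly one of several in-memory shards."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard(self, member_id):
        return self.shards[shard_for(member_id, len(self.shards))]

    def put(self, member_id, profile):
        self._shard(member_id)[member_id] = profile

    def get(self, member_id):
        return self._shard(member_id).get(member_id)
```

Simple modulo routing like this reshuffles most keys when the shard count changes, which is one reason roll-your-own partition management gets messy.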
  • 23. Best Practices
       Storage and architecture
       > Tiering systems
           - Latency vs. affinity
               - Traversal versus affinity: you need to analyze the cost and make a decision
           - Scaling vs. parallelizing
               - Do you need to keep tiering all systems to keep the scalability uniform?
           - Complexity vs. diminished dependencies
               - Do the reduced dependencies make up for the increased system complexity?
  • 25. Pixie Dust and Kool-Aid
       Building on the past
  • 26. Pixie Dust and Kool-Aid
       Building on the past
       > So what do we want:
           - Reliable
           - Available
           - Scalable
           - ACIDity on simple transactions
           - Standard and portable interface
           - Data optimizations
           - Cache and replicate
           - Low-cost BASE architecture
  • 28. LinkedIn’s Data Services Mixture of standards and pixie dust > Front a database with a service > Cache data > Route to and partition the data service > Scale and replicate services in a horizontal manner > Keep all writes ACID and subsequent reads ACID as well 28
  • 29. LinkedIn’s Data Services Mixture of standards and pixie dust > Databases are reliable > Scale out at the service > Replicate and cache > Partitioning comes from the front tier and business servers that front the data services 29
  • 30. LinkedIn’s Data Services Immediate replication vs. eventual replication > Caching needs a consistency algorithm > Techniques for immediate replication  Paxos  Chubby, Microsoft AutoPilot, ZooKeeper  N Phase Commit (2PC and 3PC) > Techniques for eventual consistency  BASE (Basically Available, Soft-state, Eventual Consistency)  Inktomi, Dynamo, AWS 30
  • 31. LinkedIn’s Data Services LinkedIn’s approach > Keep core data ACID > Keep replicated and cached data BASE > Replicate data via the data bus > Cache data in cheap memory (memcached) > Use a hint to route the client to their ACID data 31
  • 32. LinkedIn’s Data Services Databus – the linchpin of our replication 32
  • 34. LinkedIn’s Data Services Core DS > Keep core data ACID in the DB > All writes come here. > Databus source for all replication > The last line of defense for a cache miss > Manages sharding 34
  • 35. LinkedIn’s Data Services RepDS > Manages cache consistency and replication > Manages the freshness of the caller > Reads come from cache 35
  • 36. LinkedIn’s Data Services RepReader > RepReader is the typical tip of the iceberg problem > All read operations are sourced from the cache unless the caller’s freshness token is out of the window 36
  • 37. LinkedIn’s Data Services Freshness Token (AKA Pixie Dust) > The freshness token = Pixie Dust for CUD operations > It also allows us to give the caller control over whether they are content with BASE data, even if they did no CUD operation. 37
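The read path described on slides 36 and 37 can be sketched as below. The slides do not define what the freshness token contains, so this assumes it is the timestamp of the caller's last create/update/delete (CUD), and that reads go to the core store while that write is still inside a replication window. The `ProfileReader` class, its parameters, and the dict-backed stores are all hypothetical stand-ins, not LinkedIn's implementation.

```python
import time


class ProfileReader:
    """Hypothetical read path: cache first, core store when the caller wrote recently."""

    def __init__(self, core_db, cache, replication_window_s=5.0, clock=time.time):
        self.core_db = core_db              # authoritative (ACID) store, modeled as a dict
        self.cache = cache                  # replicated (BASE) copy, modeled as a dict
        self.window = replication_window_s  # assumed upper bound on replication lag
        self.clock = clock                  # injectable so the example is testable

    def write(self, member_id, profile):
        """Apply a CUD operation to the core store and return a freshness token."""
        self.core_db[member_id] = profile
        return self.clock()                 # assumption: token = time of the write

    def read(self, member_id, token=None):
        """Serve from cache unless the token says the cache may not have caught up."""
        recently_wrote = token is not None and self.clock() - token < self.window
        if not recently_wrote and member_id in self.cache:
            return self.cache[member_id], "cache"
        # Cache miss or a too-recent write: fall back to the core store.
        return self.core_db[member_id], "core"
```

A caller that just wrote passes its token and sees its own write from the core store, while every other caller keeps reading possibly stale data from the cache. That keeps most read load off the database.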
  • 38. LinkedIn’s Data Services For the love of Pixie dust and Kool-Aid > We use commodity hardware and software to run our service > We use Pixie Dust to keep costs down and keep our customers happy > We keep OPS and the exec-staff happy with our special brand of Kool-Aid 38
  • 39. Agenda > Review system ilities > What happened to databases? > SOA What > Discuss existing Best Practices > Pixie Dust and Kool-Aid are not so bad > What LinkedIn’s got up their sleeve > How it all came together… > Q&A 39
  • 40. Profile Re-architecture Changing planes in mid-flight > Original LinkedIn System > Use of XML for i18n > Phased Transition 40
  • 41. Problems from the original system Anthropology 101 > Be fair… it worked well for a startup > Many tables in one big DB > Too many similar object hierarchies > No well defined domains 41
  • 42. Why XML? Flexibility > Profile has many fields > 1NF for i18n ==> too many tables > StAX for fast parsing > Easier to version the profile > Human readable > JSON? ProtoBuf? 42
  • 43. Issues with XML <good/> <bad/> <ugly/> > XML schema design tradeoffs and analytics impact > XML is verbose > StAX is unfriendly > XML in the DB caused us some performance headaches 43
  • 44. Phased Transition Evolving a living, breathing organism > Successive iterations avoid breakages > No major site downtime > Easier to sanity check > Does not hold other teams hostage > Phases LinkedIn went through 44
  • 46. After Legacy Tables Dropped Auld Lang Syne 46
  • 47. Wrap up The moral of the story is… > Keep your system BASE > Use commodity hardware > Use pixie dust (AKA data freshness token) > Evolve slowly - no big bang! 47
  • 48. Q&A 48
  • 49. David Raccah & Dhananjay Ragade draccah@linkedin.com dragade@linkedin.com LinkedIn Corporation 49
  • 51. Performance Often mixed up with scalability > Performance  A numerical value given to a single system when asked to do a task under nominal load  If the system responds poorly without load, it will assuredly continue its molasses response time under load 51
  • 52. Availability Often mixed up with reliability > Availability  A numerical value given to a system that defines the proportion of time a system is in a functioning condition.  The most common scoring system is called nines, defined as uptime divided by the sum of uptime and downtime: five nines = 0.99999 52
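The "nines" arithmetic on slide 52 can be worked through in a few lines. The function names are illustrative, and a 365-day year is assumed.

```python
def availability(uptime_hours, downtime_hours):
    """Availability = uptime / (uptime + downtime)."""
    return uptime_hours / (uptime_hours + downtime_hours)


def downtime_minutes_per_year(avail, days_per_year=365):
    """Downtime budget implied by an availability figure, in minutes per year."""
    return (1.0 - avail) * days_per_year * 24 * 60
```

At five nines (0.99999) the budget comes to roughly 5.3 minutes of downtime per year, and at three nines (0.999) to roughly 8.8 hours.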
  • 53. Reliability The ability of a system to perform its functionality > Reliability  A system can be 100% available and still be 100% unreliable (e.g. inconsistent caching)  A person can consistently give you the wrong answer  Architecture is defined as the balance of the ilities and cost 53
  • 54. Scalability the term that many think is the holy grail > Scalability  The ability of a system to manage more traffic or to be “scaled” as more traffic appears  System slows with multiple users vs. single user  Route, partition, orchestrate, replicate, and go async  Split the system horizontally  Rarely scale vertically 54
  • 55. The rest of the ilities the ones that people tend to ignore until it's too late > Manageability  It is a double-edged sword which can be easily ignored > Serviceability  Here complexity starts to rear its ugly head > Maintainability  Of course maintainability tends to run upstream of complexity 55