Scaling the Web: Databases & NoSQL

Richard Schneeman
@schneems works for @Gowalla
Wed Nov 10 2011
whoami
• @Schneems
• BSME with Honors from Georgia Tech
• 5+ years of experience with Ruby & Rails
  • Work for @Gowalla
• Rails 3.1 contributor : )
• 3+ years of technical teaching
Traffic
Compounding Traffic
          ex. Wikipedia
Gowalla
• 50 best websites NYTimes 2010
• Founded 2009 @ SXSW
• 1 million+ Users
  • Undisclosed Visitors
• Loves/highlights/comments/stories/guides
• Facebook/Foursquare/Twitter integration
• iPhone/Android/web apps
• public API
Gowalla Backend
• Ruby on Rails
  • Uses the Ruby Language
  • Rails is the Framework
The Web is Data
• Username => String
• Birthday => Int/ Int/ Int
• Blog Post => Text
• Image => Binary-file/blob

  Data needs to be stored to be useful
Database
Gowalla Database
• PostgreSQL
  • Relational (RDBMS)
  • Open Source
  • Competitor to MySQL
  • ACID compliant
• Running on a Dedicated Managed Server
Need for Speed
• Throughput:
  • The number of operations per minute that
    can be performed


• Pure Speed:
  • How long an individual operation takes.
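
The two measures above can be told apart with a few lines of Ruby. This is a minimal sketch using only the standard library; `fake_query` is a hypothetical stand-in for a real database call, and its 2 ms sleep is an arbitrary assumption, not a Gowalla measurement.

```ruby
require 'benchmark'

# Hypothetical stand-in for a database call; sleeps ~2 ms to simulate work.
def fake_query
  sleep(0.002)
  :ok
end

# Pure speed (latency): seconds taken by one individual operation.
def latency_of
  Benchmark.realtime { yield }
end

# Throughput: operations completed per minute, extrapolated from a short run.
def throughput_per_minute(operations)
  elapsed = Benchmark.realtime { operations.times { yield } }
  (operations / elapsed) * 60
end

latency    = latency_of { fake_query }
throughput = throughput_per_minute(50) { fake_query }

puts format('latency: %.1f ms per operation', latency * 1000)
puts format('throughput: %d operations per minute', throughput)
```

Here the operations run one after another, so throughput is simply the inverse of latency. Once several workers or machines run operations in parallel, throughput can rise while the latency of each operation stays the same.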
Potential Problems
• Hardware
  • Slow Network
  • Slow hard-drive
  • Insufficient CPU
  • Insufficient Ram
• Software
  • too many Reads
  • too many Writes
Scaling Up versus Out
• Scale Up:
  • More CPU, Bigger HD, More Ram etc.
• Scale Out:
  • More machines
  • More machines
  • More machines
  • ...
Scale Up
• Bigger faster machine
  • More Ram
  • More CPU
  • Bigger ethernet bus
  • ...
• Moore's Law
• Diminishing returns
Scale Out
• Forget Moore's Law...
• Add more nodes
  • Master/ Slave Database
  • Sharding
Master/Slave
  Write → Master DB
  Master DB → Copy → Slave DB | Slave DB | Slave DB | Slave DB
  Slave DBs → Read
Master & Slave +/-
• Pro
  • Increased read speed
  • Takes read load off of master
  • Allows us to Join across all tables
• Con
  • Doesn’t buy increased write throughput
  • Single Point of Failure in Master Node
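
The routing rule behind master/slave can be sketched in plain Ruby. This is an illustration only: `FakeDB` and `ReplicatedStore` are hypothetical names, the "databases" are in-memory hashes, and replication is copied immediately, whereas a real slave usually lags behind its master.

```ruby
# In-memory stand-in for one database server (hypothetical, for illustration).
class FakeDB
  attr_reader :name, :rows

  def initialize(name)
    @name = name
    @rows = {}
  end
end

# Sends every write to the master and spreads reads across the slaves.
class ReplicatedStore
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Writes only ever touch the master, so write throughput does not scale.
  def write(key, value)
    @master.rows[key] = value
    replicate(key, value)
  end

  # Reads rotate round-robin through the slaves, taking load off the master.
  def read(key)
    slave = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    [slave.name, slave.rows[key]]
  end

  private

  # Assumption: copy immediately to keep the sketch short.
  def replicate(key, value)
    @slaves.each { |slave| slave.rows[key] = value }
  end
end

master = FakeDB.new('master')
slaves = %w[slave1 slave2 slave3 slave4].map { |n| FakeDB.new(n) }
store  = ReplicatedStore.new(master, slaves)

store.write('user:3', 'schneems')
first_read  = store.read('user:3')
second_read = store.read('user:3')
```

Adding a fifth slave raises read capacity, but every write still funnels through the single master, which is both the write bottleneck and the single point of failure.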
Sharding
  Write
  Users in USA | Users in Europe | Users in Asia | Users in Africa
  Read
Sharding +/-
• Pro
  • Increased Write & Read throughput
  • No Single Point of failure
    • Individual features can fail
• Con
  • Cannot Join queries between shards
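
Sharding by region, as in the diagram, comes down to picking a shard before every read or write. A minimal sketch, assuming each shard is just an in-memory hash; `ShardedUsers` and the sample usernames are hypothetical.

```ruby
# Maps each user to the shard for their region.
class ShardedUsers
  REGIONS = %w[USA Europe Asia Africa].freeze

  def initialize
    # One independent in-memory "database" per shard.
    @shards = REGIONS.to_h { |region| [region, {}] }
  end

  def write(region, username, attributes)
    shard_for(region)[username] = attributes
  end

  def read(region, username)
    shard_for(region)[username]
  end

  # Number of users stored on each shard.
  def shard_sizes
    @shards.transform_values(&:size)
  end

  private

  def shard_for(region)
    @shards.fetch(region) { raise ArgumentError, "no shard for #{region}" }
  end
end

users = ShardedUsers.new
users.write('USA', 'alice', { checkins: 10 })
users.write('Europe', 'bob', { checkins: 4 })

usa_alice    = users.read('USA', 'alice')
europe_alice = users.read('Europe', 'alice')
```

Each shard takes only its own region's reads and writes, which is where the extra throughput comes from. A query such as "all users with more than 5 checkins" has to visit every shard and merge the results in application code, because no single database can join across them.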
What is a Database?
• Relational Database Management System
  (RDBMS)
• Stores Data Using Schema
• A.C.I.D. compliant
  • Atomic
  • Consistent
  • Isolated
  • Durable
RDBMS
• Relational
  • Matches data on common characteristics
    in data
  • Enables “Join” & “Union” queries
• Makes data modular
Relational +/-
• Pros
  • Data is modular
  • Highly flexible data layout
• Cons
  • Getting desired data can be tricky
  • Over modularization leads to many join
    queries
  • Trade off performance for search-ability
Schema Storage
• Blueprint for data storage
• Break data into tables/columns/rows
• Give data types to your data
  • Integer
  • String
  • Text
  • Boolean
  • ...
Schema +/-
• Pros
  • Regularize our data
  • Helps keep data consistent
  • Converts to programming “types” easily
• Cons
  • Must separately manage schema
  • Adding columns & indexes to existing
    large tables can be painful & slow
ACID
• Properties that guarantee database
  transactions are processed reliably

  • Atomic
  • Consistent
  • Isolated
  • Durable
ACID
• Atomic
• Any database Transaction is all or nothing.
• If one part of the transaction fails it all fails


“An Incomplete Transaction Cannot Exist”
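
"All or nothing" can be imitated in plain Ruby by working on a copy and swapping it in only if every step succeeds. This is a sketch of the idea, not how a database implements transactions; the `Accounts` class and its balances are hypothetical.

```ruby
# All-or-nothing update over an in-memory hash of account balances.
class Accounts
  attr_reader :balances

  def initialize(balances)
    @balances = balances
  end

  # Runs the block against a copy and keeps the result only on success.
  def transaction
    working_copy = @balances.dup
    yield working_copy
    @balances = working_copy   # commit: every step succeeded
    true
  rescue StandardError
    false                      # rollback: the original balances are untouched
  end
end

accounts = Accounts.new({ 'a' => 100, 'b' => 0 })

# Succeeds: both halves of the transfer are applied together.
committed = accounts.transaction do |b|
  b['a'] -= 30
  b['b'] += 30
end

# Fails half-way: the debit must not survive without the credit.
rolled_back = accounts.transaction do |b|
  b['a'] -= 500
  raise 'insufficient funds' if b['a'].negative?
  b['b'] += 500
end
```

After both calls the balances are 70 and 30: the first transfer was committed whole, and the second left no trace.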
ACID
• Consistent
• Any transaction will take the database
  from one consistent state to another




 “Only Consistent data is allowed to be
                written”
ACID
• Isolated
• No transaction should be able to interfere
  with another transaction

“the same field cannot be updated by two sources at the exact same time”

         a = 0
         a += 1     }    a = ??
         a += 2
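
The `a = ??` question can be reproduced with two Ruby threads. A minimal sketch, assuming a hypothetical `Counter` class in which a `Mutex` plays the role of the database's isolation guarantee; the short sleep is there only to make a race likely if the lock were removed.

```ruby
# Two writers update the same field; the Mutex lets only one in at a time.
class Counter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def add(amount)
    @lock.synchronize do
      current = @value           # read
      sleep(0.001)               # widen the window where a race could happen
      @value = current + amount  # write
    end
  end
end

counter = Counter.new
writers = [1, 2].map { |amount| Thread.new { counter.add(amount) } }
writers.each(&:join)

puts counter.value
```

With the lock in place the two updates are applied one after the other and the result is always 3. Without it, both threads could read 0 and the final value could be 1 or 2, depending on which write lands last.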
ACID
• Durable
• Once a transaction is committed it will stay
  that way




      “Save it once, read it forever”
What is a Database?
• RDBMS
  • Relational
  • Flexible
  • Has a schema
  • Most likely ACID compliant
  • Typically fast under low load or when
    optimized
What is SQL?
  • Structured Query Language
  • The language databases speak
  • Based on relational algebra
    • Insert
    • Query
    • Update
    • Delete
SELECT Company, Country FROM Customers
         WHERE Country = 'USA';
Why people <3 SQL
• Relational algebra is powerful
• SQL is proven
  • well understood
  • well documented
Why people </3 SQL
• Relational algebra is hard
• Different databases support different SQL
  syntax
• Yet another programming language to learn
SQL != Database
• SQL is used to talk to an RDBMS (database)
• SQL is not an RDBMS
What is NoSQL?

  Not A
  Relational
  Database
RDBMS
Types of NoSQL
• Distributed Systems
• Document Store
• Graph Database
• Key-Value Store
• Eventually Consistent Systems

             Mix And Match ↑
Key Value Stores
• Non Relational
• Typically No Schema
• Map one Key (a string) to a Value (some
  object)




         Example: Redis
Key Value Example
require 'redis'  # assumes the redis gem and a local Redis server

redis = Redis.new

redis.set("foo", "bar")

redis.get("foo")

>> "bar"
Key Value Example
redis = Redis.new

redis.set("foo", "bar")   # key, value

redis.get("foo")          # key

>> "bar"                  # value
Key Value
  • Like a database that can only ever look up
    by primary key (id)

YES
select * from users where id = '3';

NO
select * from users where name = 'schneems';
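
The YES/NO contrast can be shown with a plain Ruby `Hash` standing in for the key-value store. This is a sketch only; the user records and the `name:` key convention are hypothetical.

```ruby
# A key-value store sketched as a plain Ruby Hash keyed by primary key.
users = {
  '1' => { name: 'alice' },
  '2' => { name: 'bob' },
  '3' => { name: 'schneems' }
}

# YES: lookup by key is a single direct access.
by_id = users['3']

# NO (not directly): finding by value means scanning every entry...
by_name_scan = users.find { |_id, user| user[:name] == 'schneems' }

# ...or maintaining a second set of keys that acts as a hand-built index.
name_index    = users.to_h { |id, user| ["name:#{user[:name]}", id] }
by_name_index = users[name_index['name:schneems']]
```

The hand-built index works, but the application now has to keep it in sync on every write, which is the bookkeeping a relational database does automatically.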
NoSQL @ Gowalla
• Redis (key-value store)
  • Store “Likes” & Analytics
• Memcache (key-value store)
  • Cache Database results
• Cassandra
  • (eventually consistent, with-schema, key
    value store)
  • Store “feeds” or “timelines”
• Solr (search index)
Memcache
• Key-Value Store
• Open Source
• Distributed
• In memory (ram) only
  • fast, but volatile
  • Not ACID
• Memory object caching system
Memcache Example
require 'dalli'  # assumes the dalli gem and memcached on localhost:11211

memcache = Dalli::Client.new("localhost:11211")

memcache.set("foo", "bar")

memcache.get("foo")

>> "bar"
Memcache
  • Can store whole objects
memcache = Dalli::Client.new("localhost:11211")  # assumes the dalli gem
user = User.where(:username => "schneems").first
memcache.set("user:3", user)

user_from_cache = memcache.get("user:3")
user_from_cache == user
>> true
user_from_cache.username
>> "schneems"
Memcache @ Gowalla
• Cache Common Queries
  • Decreases Load on DB (postgres)
    • Enables higher throughput from DB
  • Faster response than DB
    • Users see quicker page load time
What to Cache?
• Objects that change infrequently
  • users
  • spots (places)
  • etc.
• Expensive(ish) sql queries
  • Friend ids for users
  • User ids for people visiting spots
  • etc.
Memcache Distributed

  [Diagram: nodes A, B, C]
Memcache Distributed
  Easily add more nodes
  [Diagram: nodes A, B, C, D]
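
One common way a cache client spreads keys across nodes, and keeps most keys in place when a node is added, is consistent hashing. The slides do not say which scheme was used at Gowalla, so treat this as an illustration of the general idea; `HashRing` and its 100 points per node are assumptions.

```ruby
require 'digest'

# Minimal consistent-hash ring: each node owns many points on a circle,
# and a key belongs to the first node point at or after the key's hash.
class HashRing
  POINTS_PER_NODE = 100

  def initialize(nodes)
    @ring = {}
    nodes.each { |node| add(node) }
  end

  def add(node)
    POINTS_PER_NODE.times { |i| @ring[hash_of("#{node}-#{i}")] = node }
    @sorted_points = @ring.keys.sort
  end

  def node_for(key)
    h = hash_of(key)
    # Wrap around to the first point when the hash is past the last one.
    point = @sorted_points.bsearch { |p| p >= h } || @sorted_points.first
    @ring[point]
  end

  private

  # First 32 bits of the MD5 digest, as an integer position on the ring.
  def hash_of(string)
    Digest::MD5.hexdigest(string)[0, 8].to_i(16)
  end
end

keys = (1..1000).map { |i| "user:#{i}" }

ring   = HashRing.new(%w[A B C])
before = keys.to_h { |k| [k, ring.node_for(k)] }

ring.add('D')
after = keys.to_h { |k| [k, ring.node_for(k)] }

moved = keys.reject { |k| before[k] == after[k] }
puts "#{moved.size} of #{keys.size} keys moved, all of them to node D"
```

Only the keys that now fall on node D's points change owner; everything else stays where it was, so adding a node invalidates a fraction of the cache rather than all of it.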
Memcache <3’s DB
• We use them Together
• If memcache doesn’t have a value
  • Fetch from the database
  • Set the key in memcache from the database value
• Hard
  • Cache Invalidation : (
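
The fetch-then-set flow, and the invalidation step that makes it hard, can be sketched without a real cache server. In this minimal sketch a plain `Hash` stands in for memcache, and `FakeDatabase` and `CachedUsers` are hypothetical names.

```ruby
# Hypothetical slow data source standing in for the database.
class FakeDatabase
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def find(id)
    @queries += 1
    @rows[id]
  end

  def update(id, row)
    @rows[id] = row
  end
end

# Check the cache, fall back to the database, then fill the cache.
class CachedUsers
  def initialize(database, cache = {})
    @database = database
    @cache = cache
  end

  def find(id)
    key = "user:#{id}"
    return @cache[key] if @cache.key?(key)   # cache hit

    @cache[key] = @database.find(id)         # cache miss: fetch, then set
  end

  # The hard part: every write has to remember to expire the cached copy.
  def update(id, row)
    @database.update(id, row)
    @cache.delete("user:#{id}")
  end
end

db    = FakeDatabase.new({ 3 => { username: 'schneems' } })
users = CachedUsers.new(db)

users.find(3)   # miss: hits the database
users.find(3)   # hit: served from the cache
queries_after_two_reads = db.queries

users.update(3, { username: 'richard' })
fresh = users.find(3)   # miss again, because the key was invalidated
```

If any code path updates the database without going through `update`, readers keep seeing the old cached value, which is why cache invalidation earns its sad face.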
Redis
• Key Value Store
• Open Source
• Not Distributed (yet)
• Extremely Quick
• “Data structure server”
Redis Example, again
redis = Redis.new

redis.set("foo", "bar")

redis.get("foo")

>> "bar"
Redis - Has Data Types
• Strings
• Hashes
• Lists
• Sets
• Sorted Sets
Redis Example, sets
redis = Redis.new
redis.sadd("foo", "bar")
redis.smembers("foo")
>> ["bar"]
redis.sadd("foo", "fly")
redis.smembers("foo")
>> ["bar", "fly"]   # sets are unordered, so the order may differ
Redis => Likeable
• Very Fast response
• ~ 50 queries per page view
  • ~ 1 ms per query
• http://github.com/Gowalla/likeable
Cassandra
• Open Source
• Distributed
• Key Value Store
• Eventually Consistent
  • Sort of not ACID
• Uses A Schema
  • ColumnFamilies
Cassandra Distributed
  Eventual Consistency
  [Diagram: nodes A, B, C, D. Data in at node B, copied to extra nodes ... eventually]
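
What "eventually" means for a reader can be shown with a toy cluster. This sketch does not model Cassandra's actual replication protocol; `EventuallyConsistentCluster`, its explicit `replicate!` step, and the sample feed entry are hypothetical stand-ins for background copying between nodes.

```ruby
# Toy cluster: a write lands on one node and is copied to the others later.
class EventuallyConsistentCluster
  def initialize(node_names)
    @nodes = node_names.to_h { |name| [name, {}] }
    @pending = []   # replication work that has not happened yet
  end

  # Accept the write on one node and queue copies for every other node.
  def write(node, key, value)
    @nodes.fetch(node)[key] = value
    (@nodes.keys - [node]).each { |other| @pending << [other, key, value] }
  end

  # Reads are answered by whichever node is asked, stale or not.
  def read(node, key)
    @nodes.fetch(node)[key]
  end

  # Stand-in for background replication catching up.
  def replicate!
    @pending.each { |node, key, value| @nodes.fetch(node)[key] = value }
    @pending.clear
  end
end

cluster = EventuallyConsistentCluster.new(%w[A B C D])
cluster.write('B', 'feed:3', ['checked in'])

stale_read = cluster.read('D', 'feed:3')   # nil: the copy has not arrived
cluster.replicate!
fresh_read = cluster.read('D', 'feed:3')   # now every node agrees
```

The write returns as soon as one node has it, which keeps writes fast and available; the cost is the window in which a read from another node can miss the newest data.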
Cassandra @ Gowalla
• Activity Feeds
Cassandra @ Gowalla
• Chronologic
• http://github.com/Gowalla/chronologic
Should I use NoSQL?
Which One?
Pick the right tool
Tradeoffs
• Every Data store has them
• Know your data store
  • Strengths
  • Weaknesses
NoSQL vs. RDBMS
• No Magic Bullet
• Use Both!!!
• Model data in a datastore you understand
  • Switch when/if you need to
• Understand Your Options
Questions?




Richard Schneeman
@schneems works for @Gowalla