Scaling the Web:
Databases &
NoSQL


Richard Schneeman (@schneems), works for @Gowalla
Wed Nov 10, 2011
whoami
• @Schneems
• BSME with Honors from Georgia Tech
• 5+ years experience with Ruby & Rails
  • Work for @Gowalla
• Rails 3.1 contributor : )
• 3+ years technical teaching
Traffic
Compounding Traffic
          ex. Wikipedia
Gowalla
• 50 best websites NYTimes 2010
• Founded 2009 @ SXSW
• 1 million+ Users
  • Undisclosed Visitors
• Loves/highlights/comments/stories/guides
• Facebook/Foursquare/Twitter integration
• iphone/android/web apps
• public API
Gowalla Backend
• Ruby on Rails
  • Uses the Ruby Language
  • Rails is the Framework
The Web is Data
• Username => String
• Birthday => Int/ Int/ Int
• Blog Post => Text
• Image => Binary-file/blob

  Data needs to be stored
  to be useful
Database
Gowalla Database
• PostgreSQL
  • Relational (RDBMS)
  • Open Source
  • Competitor to MySQL
  • ACID compliant
• Running on a Dedicated Managed Server
Need for Speed
• Throughput:
  • The number of operations per minute that
    can be performed


• Pure Speed:
  • How long an individual operation takes.
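The two measures above can be told apart with a small timing sketch. The `operation` lambda is a hypothetical stand-in for one database call; it is an assumption for illustration, not part of Gowalla's stack.

```ruby
require 'benchmark'

# Hypothetical stand-in for one database operation.
operation = -> { (1..1_000).reduce(:+) }

count = 2_000
elapsed = Benchmark.realtime { count.times { operation.call } }

latency_ms = (elapsed / count) * 1000.0     # pure speed: time per operation
throughput_per_min = count / elapsed * 60   # throughput: operations per minute

puts format('latency: %.4f ms per operation', latency_ms)
puts format('throughput: %.0f operations per minute', throughput_per_min)
```

In a single-threaded loop like this one the two numbers are reciprocals of each other. They diverge once operations run concurrently: throughput can rise while each individual operation takes just as long.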
Potential Problems
• Hardware
  • Slow Network
  • Slow hard-drive
  • Insufficient CPU
  • Insufficient Ram
• Software
  • too many Reads
  • too many Writes
Scaling Up versus Out
• Scale Up:
  • More CPU, Bigger HD, More Ram etc.
• Scale Out:
  • More machines
  • More machines
  • More machines
  • ...
Scale Up
• Bigger faster machine
  • More Ram
  • More CPU
  • Bigger ethernet bus
  • ...
• Moore's Law
• Diminishing returns
Scale Out
• Forget Moore's Law...
• Add more nodes
  • Master/ Slave Database
  • Sharding
Master/Slave
                Write

                Master DB

                 Copy
  Slave DB   Slave DB   Slave DB   Slave DB




                 Read
Master & Slave +/-
• Pro
  • Increased read speed
  • Takes read load off of master
  • Allows us to Join across all tables
• Con
  • Doesn’t buy increased write throughput
  • Single Point of Failure in Master Node
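The read/write split above can be sketched in a few lines of Ruby. `Node` and `ReplicatedDB` are hypothetical in-memory stand-ins, not a real database driver, and the copy to the slaves happens immediately here, whereas real replication usually lags behind the master.

```ruby
# Toy in-memory stand-in for a database node (hypothetical, for illustration).
class Node
  attr_reader :name, :rows

  def initialize(name)
    @name = name
    @rows = {}
  end
end

class ReplicatedDB
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # All writes go to the single master, then are copied to every slave.
  def write(key, value)
    @master.rows[key] = value
    @slaves.each { |slave| slave.rows[key] = value }
  end

  # Reads rotate across the slaves, keeping read load off the master.
  def read(key)
    slave = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    [slave.name, slave.rows[key]]
  end
end

master = Node.new('master')
slaves = %w[slave1 slave2 slave3 slave4].map { |name| Node.new(name) }
db = ReplicatedDB.new(master, slaves)

db.write('user:3', 'schneems')
first_read = db.read('user:3')    # served by slave1
second_read = db.read('user:3')   # served by slave2
```

Adding slaves adds read capacity, but every write still passes through `@master`, which is why this layout does not raise write throughput.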
Sharding
                  Write


  Users in   Users in   Users in   Users in
   USA       Europe      Asia       Africa



                  Read
Sharding +/-
• Pro
  • Increased Write & Read throughput
  • No Single Point of failure
    • Individual features can fail
• Con
  • Cannot Join queries between shards
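A minimal sketch of routing by shard, assuming users are split by region as in the diagram above. `ShardedUsers` and its methods are hypothetical names, and each shard is just a Ruby Hash.

```ruby
# Each region gets its own independent store (a Hash stands in for a database).
class ShardedUsers
  def initialize(regions)
    @shards = regions.map { |region| [region, {}] }.to_h
  end

  # Writes and reads touch only the shard that owns the region.
  def save(id, name, region)
    shard_for(region)[id] = { name: name, region: region }
  end

  def find(id, region)
    shard_for(region)[id]
  end

  # Without the shard key, every shard has to be asked in turn.
  def find_anywhere(id)
    @shards.each_value do |shard|
      return shard[id] if shard.key?(id)
    end
    nil
  end

  private

  def shard_for(region)
    @shards.fetch(region) { raise ArgumentError, "no shard for #{region}" }
  end
end

users = ShardedUsers.new(%w[USA Europe Asia Africa])
users.save(3, 'schneems', 'USA')
users.save(7, 'amelie', 'Europe')

in_usa = users.find(3, 'USA')
in_asia = users.find(3, 'Asia')     # nil: that shard never saw user 3
anywhere = users.find_anywhere(7)   # has to visit shards one by one
```

`find_anywhere` shows the cost named in the "Con" bullet: a question that spans shards turns into one query per shard, with the combining done in application code.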
What is a Database?
• Relational Database Management System
  (RDBMS)
• Stores Data Using Schema
• A.C.I.D. compliant
  • Atomic
  • Consistent
  • Isolated
  • Durable
RDBMS
• Relational
  • Matches data on common characteristics
    in data
  • Enables “Join” & “Union” queries
• Makes data modular
Relational +/-
• Pros
  • Data is modular
  • Highly flexible data layout
• Cons
  • Getting desired data can be tricky
  • Over modularization leads to many join
    queries
  • Trade off performance for search-ability
Schema Storage
• Blueprint for data storage
• Break data into tables/columns/rows
• Give data types to your data
  • Integer
  • String
  • Text
  • Boolean
  • ...
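To make "give data types to your data" concrete, here is a small Ruby sketch of a schema acting as a blueprint. The `users_schema` layout and the `validate` lambda are hypothetical; a real RDBMS enforces this inside the database rather than in application code.

```ruby
# Hypothetical schema: column name => accepted Ruby classes.
users_schema = {
  id: [Integer],
  username: [String],
  bio: [String],                   # a Text column also maps to String in Ruby
  admin: [TrueClass, FalseClass]   # Ruby has no single Boolean class
}

# Reject rows with unknown columns or values of the wrong type.
validate = lambda do |schema, row|
  unknown = row.keys - schema.keys
  raise ArgumentError, "unknown columns: #{unknown.join(', ')}" unless unknown.empty?

  schema.each do |column, types|
    value = row[column]
    next if types.any? { |type| value.is_a?(type) }

    raise TypeError, "#{column} must be #{types.join(' or ')}, got #{value.inspect}"
  end
  row
end

good_row = validate.call(users_schema, { id: 3, username: 'schneems', bio: 'Rails dev', admin: false })

begin
  validate.call(users_schema, { id: '3', username: 'schneems', bio: '', admin: false })
rescue TypeError => e
  rejected = e.message   # the string '3' is not an Integer
end
```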
Schema +/-
• Pros
  • Regularize our data
  • Helps keep data consistent
  • Converts to programming “types” easily
• Cons
  • Must separately manage schema
  • Adding columns & indexes to existing
    large tables can be painful & slow
ACID
• Properties that guarantee database
  transactions are processed reliably

  • Atomic
  • Consistent
  • Isolated
  • Durable
ACID
• Atomic
• Any database Transaction is all or nothing.
• If one part of the transaction fails it all fails


“An Incomplete Transaction Cannot Exist”
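The all-or-nothing rule can be sketched in plain Ruby. `Accounts` is a hypothetical toy class that rolls back by restoring a snapshot; a real RDBMS provides this through BEGIN / COMMIT / ROLLBACK.

```ruby
# Toy account store that undoes partial work when a step fails.
class Accounts
  class InsufficientFunds < StandardError; end

  attr_reader :balances

  def initialize(balances)
    @balances = balances
  end

  # Run the block; if anything raises, restore the snapshot and re-raise.
  def transaction
    snapshot = @balances.dup
    yield
  rescue StandardError
    @balances = snapshot
    raise
  end

  def transfer(from, to, amount)
    transaction do
      @balances[to] += amount   # credit first, on purpose, to show rollback
      raise InsufficientFunds if @balances[from] < amount

      @balances[from] -= amount
    end
  end
end

accounts = Accounts.new('alice' => 100, 'bob' => 50)
accounts.transfer('alice', 'bob', 30)

begin
  accounts.transfer('alice', 'bob', 500)   # fails halfway through
rescue Accounts::InsufficientFunds
  # the partial credit to bob has been rolled back
end
```

After the failed transfer the balances are exactly what they were before it started: no incomplete transaction is left behind.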
ACID
• Consistent
• Any transaction will take the database
  from one consistent state to another




 “Only Consistent data is allowed to be
                written”
ACID
• Isolated
• No transaction should be able to interfere
  with another transaction

“the same field cannot be updated by two
     sources at the exact same time”



         a = 0
         a += 1     }  a = ??
         a += 2
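Without isolation, two concurrent read-modify-write updates can overwrite one another, so `a` could end up as 1, 2, or 3. A minimal Ruby sketch of the isolated case, using a `Mutex` as an analogy for what the database does internally (this is an assumption for illustration, not how any particular database implements isolation):

```ruby
a = 0
lock = Mutex.new

# Two "sources" update the same field; the lock lets only one in at a time.
writers = [1, 2].map do |amount|
  Thread.new do
    lock.synchronize { a += amount }
  end
end
writers.each(&:join)

puts a   # 3, whichever writer runs first
```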
ACID
• Durable
• Once a transaction is committed it will stay
  that way




      “Save it once, read it forever”
What is a Database?
• RDBMS
  • Relational
  • Flexible
  • Has a schema
  • Most likely ACID compliant
  • Typically fast under low load or when
    optimized
What is SQL?
  • Structured Query Language
  • The language databases speak
  • Based on relational algebra
    • Insert
    • Query
    • Update
    • Delete
“SELECT Company, Country FROM Customers
         WHERE Country = 'USA' ”
Why people <3 SQL
• Relational algebra is powerful
• SQL is proven
  • well understood
  • well documented
Why people </3 SQL
• Relational algebra is hard
• Different databases support different SQL
  syntax
• Yet another programming language to learn
SQL != Database
• SQL is used to talk to a RDBMS (database)
• SQL is not a RDBMS
What is NoSQL?

  Not A
  Relational
  Database (RDBMS)
Types of NoSQL
• Distributed Systems
• Document Store
• Graph Database
• Key-Value Store
• Eventually Consistent Systems

             Mix And Match ↑
Key Value Stores
• Non Relational
• Typically No Schema
• Map one Key (a string) to a Value (some
  object)




         Example: Redis
Key Value Example
require "redis"   # redis-rb gem; assumes a Redis server on localhost

redis = Redis.new

redis.set("foo", "bar")

redis.get("foo")

>> "bar"
Key Value Example
redis = Redis.new

redis.set("foo", "bar")   # key, value

redis.get("foo")          # key

>> "bar"                  # value
Key Value
  • Like a database that can only ever look
    rows up by primary key (id)

YES
select * from users where id = '3';

NO
select * from users where name = 'schneems';
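The same YES/NO contrast, sketched with a plain Ruby Hash as a hypothetical stand-in for a key-value store. The key names (`user:3`, `username:schneems`) are illustrative assumptions.

```ruby
# A Hash stands in for the key-value store.
store = {
  'user:3' => { name: 'schneems' },
  'user:4' => { name: 'gowalla' }
}

# YES: direct lookup by key, one step.
by_key = store['user:3']

# NO (not natively): finding by a field means scanning every value...
by_scan = store.values.find { |user| user[:name] == 'schneems' }

# ...or maintaining a second key by hand that acts as an index.
store['username:schneems'] = 'user:3'
by_index = store[store['username:schneems']]
```

The hand-built index works, but the application now has to keep it in step with the data, which is work an RDBMS index does automatically.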
NoSQL @ Gowalla
• Redis (key-value store)
  • Store “Likes” & Analytics
• Memcache (key-value store)
  • Cache Database results
• Cassandra
  • (eventually consistent, with-schema, key
    value store)
  • Store “feeds” or “timelines”
• Solr (search index)
Memcache
• Key-Value Store
• Open Source
• Distributed
• In memory (ram) only
  • fast, but volatile
  • Not ACID
• Memory object caching system
Memcache Example
memcache = Memcache.new

memcache.set("foo", "bar")

memcache.get("foo")

>> "bar"
Memcache
  • Can store whole objects
memcache = Memcache.new
# .first returns a single user rather than a query object
user = User.where(:username => "schneems").first
memcache.set("user:3", user)

user_from_cache = memcache.get("user:3")
user_from_cache == user
>> true
user_from_cache.username
>> "schneems"
Memcache @ Gowalla
• Cache Common Queries
  • Decreases Load on DB (postgres)
    • Enables higher throughput from DB
  • Faster response than DB
    • Users see quicker page load time
What to Cache?
• Objects that change infrequently
  • users
  • spots (places)
  • etc.
• Expensive(ish) sql queries
  • Friend ids for users
  • User ids for people visiting spots
  • etc.
Memcache Distributed

              A




                   C
          B
Memcache Distributed
          Easily add more nodes


          A          D




          B         C
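Memcache servers do not coordinate with each other; the client picks a node by hashing the key. One common scheme is consistent hashing, sketched below. The slides do not say which scheme Gowalla's client used, so the `Ring` class, its point count, and the use of MD5 are all assumptions for illustration.

```ruby
require 'digest'

# A minimal consistent-hash ring. Each node is placed at many points on the
# ring so that keys spread fairly evenly across nodes.
class Ring
  POINTS_PER_NODE = 100

  def initialize(nodes)
    @points = {}
    nodes.each { |node| add(node) }
  end

  def add(node)
    POINTS_PER_NODE.times { |i| @points[hash_of("#{node}-#{i}")] = node }
    @sorted = @points.keys.sort
  end

  # Walk clockwise to the first point at or after the key's hash,
  # wrapping around to the start of the ring if there is none.
  def node_for(key)
    h = hash_of(key)
    point = @sorted.bsearch { |p| p >= h } || @sorted.first
    @points[point]
  end

  private

  def hash_of(string)
    Digest::MD5.hexdigest(string)[0, 8].to_i(16)
  end
end

keys = (1..1000).map { |i| "user:#{i}" }
ring = Ring.new(%w[A B C])
before = keys.map { |key| [key, ring.node_for(key)] }.to_h

ring.add('D')
moved = keys.count { |key| ring.node_for(key) != before[key] }
puts "#{moved} of #{keys.size} keys moved after adding node D"
```

The point of the ring is that adding node D only takes keys away from A, B, and C and hands them to D; keys never shuffle between the existing nodes, so most of the cache stays warm.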
Memcache <3’s DB
• We use them Together
• If memcache doesn’t have a value
  • Fetch from the database
  • Set the key from database
• Hard
  • Cache Invalidation : (
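The fetch-then-set flow above, plus the invalidation step that makes it hard, can be sketched as follows. `UserStore` is hypothetical: one Hash stands in for memcache and another for the database.

```ruby
# A Hash for the cache, a Hash for the database, and a counter for DB reads.
class UserStore
  attr_reader :db_reads

  def initialize(db_rows)
    @db = db_rows
    @cache = {}
    @db_reads = 0
  end

  # If the cache doesn't have the value, fetch from the database
  # and set the key so the next read is a hit.
  def find(id)
    key = "user:#{id}"
    return @cache[key] if @cache.key?(key)

    @db_reads += 1
    @cache[key] = @db[id]
  end

  # The hard part: every write must also invalidate the cached copy.
  def update(id, attrs)
    @db[id] = @db[id].merge(attrs)
    @cache.delete("user:#{id}")
  end
end

users = UserStore.new(3 => { username: 'schneems' })
users.find(3)                           # miss: reads the database, fills the cache
users.find(3)                           # hit: served from the cache
users.update(3, { username: 'richard' })
fresh = users.find(3)                   # miss again, because the key was deleted
```

If `update` forgot the `@cache.delete` line, readers would keep seeing the old username until the key expired, which is the kind of bug that makes cache invalidation painful.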
Redis
• Key Value Store
• Open Source
• Not Distributed (yet)
• Extremely Quick
• “Data structure server”
Redis Example, again
redis = Redis.new

redis.set("foo", "bar")

redis.get("foo")

>> "bar"
Redis - Has Data Types
• Strings
• Hashes
• Lists
• Sets
• Sorted Sets
Redis Example, sets
redis = Redis.new
redis.sadd("foo", "bar")
redis.smembers("foo")   # SMEMBERS lists the members of a set
>> ["bar"]
redis.sadd("foo", "fly")
redis.smembers("foo")   # sets are unordered; order may vary
>> ["bar", "fly"]
Redis => Likeable
• Very Fast response
• ~ 50 queries per page view
  • ~ 1 ms per query
• http://github.com/Gowalla/likeable
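A sketch of why sets fit "likes" well: each user appears at most once, and counting or checking membership is a single call. This is not the likeable gem's implementation. `FakeRedis` is a hypothetical in-memory stand-in whose method names mirror the Redis commands SADD, SISMEMBER, and SCARD, and the key layout `spot:42:likes` is an assumption.

```ruby
require 'set'

# In-memory stand-in for Redis sets, so the sketch runs without a server.
class FakeRedis
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Returns true only if the member was not already in the set.
  def sadd(key, member)
    !@sets[key].add?(member).nil?
  end

  def sismember(key, member)
    @sets[key].include?(member)
  end

  def scard(key)
    @sets[key].size
  end
end

redis = FakeRedis.new
redis.sadd('spot:42:likes', 'user:3')
redis.sadd('spot:42:likes', 'user:7')
redis.sadd('spot:42:likes', 'user:3')   # a duplicate like is ignored

like_count = redis.scard('spot:42:likes')
liked = redis.sismember('spot:42:likes', 'user:3')
```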
Cassandra
• Open Source
• Distributed
• Key Value Store
• Eventually Consistent
  • Sort of not ACID
• Uses A Schema
  • ColumnFamilies
Cassandra Distributed
           Eventual Consistency


           A          D
                          Copied To
                          Extra
                          Nodes ...
                          Eventually
 Data In   B         C
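The diagram's "copied to extra nodes, eventually" can be sketched with a toy cluster. `Cluster` is a hypothetical class: a write lands on one node, copies are queued, and `replicate!` stands in for background replication. Cassandra's actual mechanics are more involved than this.

```ruby
# Toy cluster: a write lands on one node and reaches the others later.
class Cluster
  def initialize(names)
    @nodes = names.map { |name| [name, {}] }.to_h
    @pending = []
  end

  # Accept the write on one node and queue copies for the rest.
  def write(node, key, value)
    @nodes.fetch(node)[key] = value
    (@nodes.keys - [node]).each { |other| @pending << [other, key, value] }
  end

  def read(node, key)
    @nodes.fetch(node)[key]
  end

  # Stand-in for background replication catching up.
  def replicate!
    @pending.each { |node, key, value| @nodes[node][key] = value }
    @pending.clear
  end
end

cluster = Cluster.new(%w[A B C D])
cluster.write('B', 'feed:3', ['checked in at SXSW'])

stale = cluster.read('D', 'feed:3')   # nil: the copy has not arrived yet
cluster.replicate!
fresh = cluster.read('D', 'feed:3')   # now matches what B accepted
```

Between the write and `replicate!`, nodes B and D disagree. That window is the price paid for accepting writes on any node without waiting for the others.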
Cassandra @ Gowalla: Activity Feeds
Cassandra @ Gowalla
• Chronologic
• http://github.com/Gowalla/chronologic
Should I use NoSQL?
Which One?
Pick the right tool
Tradeoffs
• Every Data store has them
• Know your data store
  • Strengths
  • Weaknesses
NoSQL vs. RDBMS
• No Magic Bullet
• Use Both!!!
• Model data in a datastore you understand
  • Switch when/if you need to
• Understand Your Options
Questions?




Richard Schneeman
@schneems works for @Gowalla

 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 

Scaling the Web: Databases & NoSQL

  • 1. Scaling the Web: Databases & NoSQL • Richard Schneeman • @schneems works for @Gowalla • Wed Nov 10 2011
  • 2. whoami • @Schneems • BSME with Honors from Georgia Tech • 5 + years experience Ruby & Rails • Work for @Gowalla • Rails 3.1 contributor : ) • 3 + years technical teaching
  • 4. Compounding Traffic ex. Wikipedia
  • 5. Compounding Traffic ex. Wikipedia
  • 7. Gowalla • 50 best websites NYTimes 2010 • Founded 2009 @ SXSW • 1 million+ Users • Undisclosed Visitors • Loves/highlights/comments/stories/guides • Facebook/Foursquare/Twitter integration • iphone/android/web apps • public API
  • 8.
  • 9. Gowalla Backend • Ruby on Rails • Uses the Ruby Language • Rails is the Framework
  • 10. The Web is Data • Username => String • Birthday => Int/ Int/ Int • Blog Post => Text • Image => Binary-file/blob Data needs to be stored to be useful
  • 12. Gowalla Database • PostgreSQL • Relational (RDBMS) • Open Source • Competitor to MySQL • ACID compliant • Running on a Dedicated Managed Server
  • 13. Need for Speed • Throughput: • The number of operations per minute that can be performed • Pure Speed: • How long an individual operation takes.
  • 14. Potential Problems • Hardware • Slow Network • Slow hard-drive • Insufficient CPU • Insufficient Ram • Software • too many Reads • too many Writes
  • 15. Scaling Up versus Out • Scale Up: • More CPU, Bigger HD, More Ram etc. • Scale Out: • More machines • More machines • More machines • ...
  • 16. Scale Up • Bigger faster machine • More Ram • More CPU • Bigger ethernet bus • ... • Moore's Law • Diminishing returns
  • 17. Scale Out • Forget Moore's Law... • Add more nodes • Master/ Slave Database • Sharding
  • 18. Master/Slave • Write → Master DB → Copy → Slave DB | Slave DB | Slave DB | Slave DB → Read
  • 19. Master & Slave +/- • Pro • Increased read speed • Takes read load off of master • Allows us to Join across all tables • Con • Doesn’t buy increased write throughput • Single Point of Failure in Master Node
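
The write and read paths on slides 18 and 19 can be sketched in plain Ruby. This is a minimal illustration, not Gowalla's setup: `Node` and `ReplicatedStore` are hypothetical names, the "databases" are in-memory hashes, and the copy to the slaves happens immediately, whereas real replication is usually asynchronous.

```ruby
# In-memory stand-in for a database node; a real one would hold a connection.
class Node
  attr_reader :name, :data

  def initialize(name)
    @name = name
    @data = {}
  end
end

# Hypothetical router: one master for writes, several slaves for reads.
class ReplicatedStore
  attr_reader :master, :slaves

  def initialize(slave_count)
    @master = Node.new("master")
    @slaves = Array.new(slave_count) { |i| Node.new("slave-#{i}") }
    @next_slave = 0
  end

  # Every write goes to the single master, then is copied to each slave.
  def write(key, value)
    @master.data[key] = value
    @slaves.each { |slave| slave.data[key] = value }
    value
  end

  # Reads rotate across the slaves (round-robin), keeping load off the master.
  def read(key)
    slave = @slaves[@next_slave]
    @next_slave = (@next_slave + 1) % @slaves.size
    [slave.name, slave.data[key]]
  end
end

store = ReplicatedStore.new(4)
store.write("user:3", "schneems")
2.times { p store.read("user:3") }
```

Adding slaves spreads reads further, but every write still passes through `write` on the one master, which is the con listed on slide 19.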
  • 20. Sharding • Write → Users in USA | Users in Europe | Users in Asia | Users in Africa → Read
  • 21. Sharding +/- • Pro • Increased Write & Read throughput • No Single Point of failure • Individual features can fail • Con • Cannot Join queries between shards
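
A rough sketch of the region-based sharding drawn on slide 20. It assumes each user record carries a `:region` field to use as the shard key; that field, the `SHARDS` constant, and the helper names are all hypothetical, and each shard is just an in-memory hash.

```ruby
# One in-memory Hash per shard, keyed by region as in the slide 20 diagram.
SHARDS = {
  "USA"    => {},
  "Europe" => {},
  "Asia"   => {},
  "Africa" => {}
}

# Pick the shard for a user; raises KeyError for an unknown region.
def shard_for(user)
  SHARDS.fetch(user[:region])
end

# Writes touch exactly one shard, so write load is split across machines.
def save_user(user)
  shard_for(user)[user[:id]] = user
end

# Lookups need the shard key too; without it every shard must be asked.
def find_user(region, id)
  SHARDS.fetch(region)[id]
end

save_user(id: 1, name: "alice", region: "USA")
save_user(id: 2, name: "bjorn", region: "Europe")
p find_user("Europe", 2)
p SHARDS["USA"].size
```

Because the USA and Europe users live in separate hashes, a query that combines them has to be stitched together in application code, which mirrors the "cannot join between shards" con.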
  • 22. What is a Database? • Relational Database Management System (RDBMS) • Stores Data Using Schema • A.C.I.D. compliant • Atomic • Consistent • Isolated • Durable
  • 23. RDBMS • Relational • Matches data on common characteristics in data • Enables “Join” & “Union” queries • Makes data modular
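
To make "matches data on common characteristics" concrete, here is a small inner join written by hand over two arrays of rows. The tables, column names, and values are invented for illustration; a real RDBMS does this work for you and uses indexes to do it quickly.

```ruby
# Two hypothetical tables as arrays of rows; user_id is the shared column.
USERS = [
  { id: 1, name: "schneems" },
  { id: 2, name: "alice" }
]
CHECKINS = [
  { user_id: 1, spot: "Coffee Shop" },
  { user_id: 1, spot: "Book Store" },
  { user_id: 2, spot: "Park" }
]

# Inner join: pair each checkin with the user row whose id matches user_id.
# Checkins with no matching user are dropped.
def join_users_checkins(users, checkins)
  users_by_id = users.to_h { |u| [u[:id], u] }
  checkins.filter_map do |c|
    user = users_by_id[c[:user_id]]
    { name: user[:name], spot: c[:spot] } if user
  end
end

join_users_checkins(USERS, CHECKINS).each { |row| p row }
```

Keeping users and checkins in separate tables is the "modular" part; the join is the price paid to put them back together.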
  • 24. Relational +/- • Pros • Data is modular • Highly flexible data layout • Cons • Getting desired data can be tricky • Over modularization leads to many join queries • Trade off performance for search-ability
  • 25. Schema Storage • Blueprint for data storage • Break data into tables/columns/rows • Give data types to your data • Integer • String • Text • Boolean • ...
  • 26. Schema +/- • Pros • Regularize our data • Helps keep data consistent • Converts to programming “types” easily • Cons • Must separately manage schema • Adding columns & indexes to existing large tables can be painful & slow
  • 27. ACID • Properties that guarantee database transactions are processed reliably • Atomic • Consistent • Isolated • Durable
  • 28. ACID • Atomic • Any database Transaction is all or nothing. • If one part of the transaction fails it all fails “An Incomplete Transaction Cannot Exist”
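
One way to picture "all or nothing" is a store that applies changes to a working copy and only keeps the copy if the whole block succeeds. `TinyStore` is a hypothetical toy, not how PostgreSQL implements transactions, and the shallow copy it makes is only safe here because the values are plain integers.

```ruby
# A tiny in-memory store whose transaction block is all-or-nothing.
class TinyStore
  attr_reader :data

  def initialize(data = {})
    @data = data
  end

  # Run the block against a copy; keep the copy only if nothing raised.
  def transaction
    working = @data.dup
    yield working
    @data = working
    true
  rescue StandardError
    false # @data was never touched, so no partial write is visible
  end
end

accounts = TinyStore.new("alice" => 100, "bob" => 0)

# Fails halfway: alice is debited in the working copy, but nothing is kept.
accounts.transaction do |d|
  d["alice"] -= 150
  raise "insufficient funds" if d["alice"] < 0
  d["bob"] += 150
end
p accounts.data
```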
  • 29. ACID • Consistent • Any transaction will take the database from one consistent state to another “Only Consistent data is allowed to be written”
  • 30. ACID • Isolated • No transaction should be able to interfere with another transaction “the same field cannot be updated by two sources at the exact same time” • a = 0 • a += 1 and a += 2 at the same time • a = ??
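
The `a = ??` question on slide 30 can be played out step by step. The first method below hard-codes one unlucky interleaving of two writers so the result is repeatable; the second uses a `Mutex` so each read-modify-write runs alone. Both method names are made up for this sketch, and a database would use its own locking or versioning rather than a Ruby mutex.

```ruby
# Lost update: both writers read a = 0 before either writes back.
def unisolated_updates
  a = 0
  read_1 = a          # writer 1 reads 0
  read_2 = a          # writer 2 reads 0
  a = read_1 + 1      # writer 1 stores 1
  a = read_2 + 2      # writer 2 stores 2, overwriting writer 1
  a
end

# Isolated: a Mutex lets only one read-modify-write run at a time.
def isolated_updates
  a = 0
  lock = Mutex.new
  threads = [1, 2].map do |n|
    Thread.new { lock.synchronize { a += n } }
  end
  threads.each(&:join)
  a
end

p unisolated_updates # => 2, the += 1 is lost
p isolated_updates   # => 3, both increments survive
```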
  • 31. ACID • Durable • Once a transaction is committed it will stay that way “Save it once, read it forever”
  • 32. What is a Database? • RDBMS • Relational • Flexible • Has a schema • Most likely ACID compliant • Typically fast under low load or when optimized
  • 33. What is SQL? • Structured Query Language • The language databases speak • Based on relational algebra • Insert • Query • Update • Delete “SELECT Company, Country FROM Customers WHERE Country = 'USA' ”
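
For readers more at home in Ruby than SQL, the SELECT on slide 33 maps roughly onto `select` (the WHERE clause) followed by `map` (the column list). The `CUSTOMERS` rows below are invented for illustration, and this in-memory version only mirrors the shape of the query, not how a database executes it.

```ruby
# Hypothetical Customers rows, invented for illustration.
CUSTOMERS = [
  { company: "Acme",    country: "USA",     contact: "Ann" },
  { company: "Globex",  country: "Germany", contact: "Bernd" },
  { company: "Initech", country: "USA",     contact: "Carl" }
]

# SELECT Company, Country FROM Customers WHERE Country = <country>
def companies_in(customers, country)
  customers
    .select { |row| row[:country] == country }   # WHERE Country = ...
    .map { |row| row.slice(:company, :country) } # SELECT Company, Country
end

companies_in(CUSTOMERS, "USA").each { |row| p row }
```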
  • 34. Why people <3 SQL • Relational algebra is powerful • SQL is proven • well understood • well documented
  • 35. Why people </3 SQL • Relational algebra is hard • Different databases support different SQL syntax • Yet another programming language to learn
  • 36. SQL != Database • SQL is used to talk to an RDBMS (database) • SQL is not an RDBMS
  • 37. What is NoSQL? Not A Relational Database
  • 38. RDBMS
  • 39. Types of NoSQL • Distributed Systems • Document Store • Graph Database • Key-Value Store • Eventually Consistent Systems Mix And Match ↑
  • 40. Key Value Stores • Non Relational • Typically No Schema • Map one Key (a string) to a Value (some object) Example: Redis
  • 41. Key Value Example • redis = Redis.new • redis.set("foo", "bar") • redis.get("foo") >> "bar"
  • 42. Key Value Example • redis = Redis.new • redis.set("foo", "bar") # Key, Value • redis.get("foo") # Key • >> "bar" # Value
  • 43. Key Value • Like a database that can only ever use the primary key (id) • YES: select * from users where id = '3'; • NO: select * from users where name = 'schneems';
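
The YES/NO contrast on slide 43 looks like this with a plain Ruby hash standing in for the key-value store. The key format and records are hypothetical. The point is only that a key lookup is a single step, while anything else means scanning every value or maintaining a second key yourself.

```ruby
# A key-value store answers "what is stored under this key?" in one lookup.
USERS_BY_ID = {
  "user:3" => { name: "schneems" },
  "user:4" => { name: "alice" }
}

# YES: the equivalent of "where id = '3'".
def find_by_key(store, key)
  store[key]
end

# NO: there is no index on values, so "where name = ..." walks every entry.
def find_by_name(store, name)
  store.values.find { |user| user[:name] == name }
end

p find_by_key(USERS_BY_ID, "user:3")
p find_by_name(USERS_BY_ID, "alice")
```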
  • 44. NoSQL @ Gowalla • Redis (key-value store) • Store “Likes” & Analytics • Memcache (key-value store) • Cache Database results • Cassandra • (eventually consistent, with-schema, key value store) • Store “feeds” or “timelines” • Solr (search index)
  • 45. Memcache • Key-Value Store • Open Source • Distributed • In memory (ram) only • fast, but volatile • Not ACID • Memory object caching system
  • 46. Memcache Example • memcache = Memcache.new • memcache.set("foo", "bar") • memcache.get("foo") >> "bar"
  • 47. Memcache • Can store whole objects • memcache = Memcache.new • user = User.where(:username => "schneems").first • memcache.set("user:3", user) • user_from_cache = memcache.get("user:3") • user_from_cache == user >> true • user_from_cache.username >> "schneems"
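
Storing "whole objects" works because the client turns the object into bytes on `set` and rebuilds it on `get`. The sketch below assumes Marshal is the serializer, which is the usual default in Ruby memcache clients but should be checked for the client you use. `ByteCache` and the `User` struct are hypothetical stand-ins, not the Memcache API.

```ruby
# Hypothetical stand-in for the User model on the slide.
User = Struct.new(:id, :username)

# Fake cache that holds only byte strings, as a remote cache server would.
class ByteCache
  def initialize
    @bytes = {}
  end

  def set(key, object)
    @bytes[key] = Marshal.dump(object) # object -> bytes
  end

  def get(key)
    raw = @bytes[key]
    raw && Marshal.load(raw)           # bytes -> a new, equal object
  end
end

cache = ByteCache.new
user = User.new(3, "schneems")
cache.set("user:3", user)

user_from_cache = cache.get("user:3")
p user_from_cache == user      # same field values
p user_from_cache.equal?(user) # but a different object in memory
```

The cached copy is a snapshot: later changes to `user` do not reach the cache until the key is set again.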
  • 48. Memcache @ Gowalla • Cache Common Queries • Decreases Load on DB (postgres) • Enables higher throughput from DB • Faster response than DB • Users see quicker page load time
  • 49. What to Cache? • Objects that change infrequently • users • spots (places) • etc. • Expensive(ish) sql queries • Friend ids for users • User ids for people visiting spots • etc.
  • 51. Memcache • Distributed • Easily add more nodes • Nodes A, B, C, D
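
A distributed cache needs every client to agree on which node owns a key. A simple scheme, sketched here as an assumption rather than as what any particular memcache client does, is to hash the key and take it modulo the node count. `NODES` and `node_for` are illustrative names.

```ruby
require "zlib"

# The four nodes from the slide 51 diagram.
NODES = %w[A B C D]

# Map a key to a node with a stable hash so every client picks the same one.
def node_for(key, nodes = NODES)
  nodes[Zlib.crc32(key) % nodes.size]
end

puts node_for("user:3")

# Count how many of 1000 keys land on a different node once E is added.
keys = (1..1000).map { |i| "user:#{i}" }
moved = keys.count { |k| node_for(k) != node_for(k, NODES + ["E"]) }
puts "#{moved} of #{keys.size} keys change nodes after adding E"
```

With plain modulo hashing, changing the node count remaps a large share of keys, which shows up as cache misses. Consistent hashing is a common technique for reducing that churn.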
  • 52. Memcache <3’s DB • We use them Together • If memcache doesn’t have a value • Fetch from the database • Set the key from database • Hard • Cache Invalidation : (
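
The fetch-or-fill steps on slide 52 are often called the cache-aside pattern. Below is a minimal sketch with hashes standing in for both memcache and the database; `CachedLookup` and its method names are hypothetical, and `db_reads` exists only to show when the database was actually hit.

```ruby
# Hash stand-ins for the cache and the database.
class CachedLookup
  attr_reader :db_reads

  def initialize(database)
    @database = database
    @cache = {}
    @db_reads = 0
  end

  # Cache hit: return it. Miss: fetch from the database, then set the key.
  def fetch(key)
    return @cache[key] if @cache.key?(key)

    @db_reads += 1
    @cache[key] = @database[key]
  end

  # The hard part: every write must also drop the now-stale cache entry.
  def update(key, value)
    @database[key] = value
    @cache.delete(key)
  end
end

lookup = CachedLookup.new("user:3" => "schneems")
lookup.fetch("user:3")  # miss, reads the database
lookup.fetch("user:3")  # hit, served from the cache
p lookup.db_reads

lookup.update("user:3", "Schneems")
p lookup.fetch("user:3") # miss again, sees the new value
```

If `update` forgot the `delete`, readers would keep seeing the old value, which is the cache invalidation problem the slide frowns at.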
  • 53. Redis • Key Value Store • Open Source • Not Distributed (yet) • Extremely Quick • “Data structure server”
  • 54. Redis Example, again • redis = Redis.new • redis.set("foo", "bar") • redis.get("foo") >> "bar"
  • 55. Redis - Has Data Types • Strings • Hashes • Lists • Sets • Sorted Sets
  • 56. Redis Example, sets • redis = Redis.new • redis.sadd("foo", "bar") • redis.smembers("foo") >> ["bar"] • redis.sadd("foo", "fly") • redis.smembers("foo") >> ["bar", "fly"]
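
The set behaviour on slide 56 can be tried without a Redis server using Ruby's own `Set`. `FakeRedis` is a hypothetical in-memory stand-in that borrows the command names `sadd` and `smembers`. It does not reproduce the real client's return values in every version, and unlike Redis it happens to keep insertion order.

```ruby
require "set"

# In-memory stand-in that mimics set semantics: each member is stored once.
class FakeRedis
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Returns true only when the member was not already in the set.
  def sadd(key, member)
    !@sets[key].add?(member).nil?
  end

  # Returns the members as an array; real Redis sets are unordered.
  def smembers(key)
    @sets[key].to_a
  end
end

redis = FakeRedis.new
p redis.sadd("foo", "bar")
p redis.sadd("foo", "fly")
p redis.sadd("foo", "bar") # duplicate, the set is unchanged
p redis.smembers("foo")
```

Ignoring duplicates is what makes a set a natural fit for something like "likes", where the same user liking the same item twice should count once.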
  • 57. Redis => Likeable • Very Fast response • ~ 50 queries per page view • ~ 1 ms per query • http://github.com/Gowalla/likeable
  • 58. Cassandra • Open Source • Distributed • Key Value Store • Eventually Consistent • Sort of not ACID • Uses A Schema • ColumnFamilies
  • 59. Cassandra • Distributed • Eventual Consistency • Nodes A, B, C, D • Data In → Copied To Extra Nodes ... Eventually
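
"Copied to extra nodes ... eventually" can be illustrated with a toy cluster where a write lands on one node right away and the copies wait in a queue. `EventualCluster` is a hypothetical sketch: it leaves out replication factors, tunable consistency levels, and conflict resolution, all of which a real Cassandra cluster has.

```ruby
# Toy cluster: each node is a Hash, and replication is a queue of pending copies.
class EventualCluster
  def initialize(node_names)
    @nodes = node_names.to_h { |name| [name, {}] }
    @pending = [] # copies that have not been applied yet
  end

  # The write lands on one node immediately; copies to the others are queued.
  def write(node, key, value)
    @nodes.fetch(node)[key] = value
    (@nodes.keys - [node]).each { |other| @pending << [other, key, value] }
  end

  def read(node, key)
    @nodes.fetch(node)[key]
  end

  # Apply the queued copies; afterwards every node agrees.
  def replicate!
    @pending.each { |node, key, value| @nodes[node][key] = value }
    @pending.clear
  end
end

cluster = EventualCluster.new(%w[A B C D])
cluster.write("A", "feed:3", "checked in")
p cluster.read("C", "feed:3") # nil, the copy has not arrived yet
cluster.replicate!
p cluster.read("C", "feed:3") # now matches node A
```

The window between `write` and `replicate!` is where a reader can see stale data, which is the tradeoff accepted in exchange for fast distributed writes.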
  • 60. Cassandra @ Gowalla • Activity Feeds
  • 61. Cassandra @ Gowalla • Chronologic • http://github.com/Gowalla/chronologic
  • 65. Tradeoffs • Every Data store has them • Know your data store • Strengths • Weaknesses
  • 66. NoSQL vs. RDBMS • No Magic Bullet • Use Both!!! • Model data in a datastore you understand • Switch when/if you need to • Understand Your Options