QCon Beijing 2011



                                                                   MongoDB




: http://czone.chinavisual.com/art/4b4501548f47e8ef73699a0c.html
About.me

•         ( nightsailer)

    • @nightsailer (Twitter, Sina, LinkedIn, GitHub, ...)
    • nightsailer # gmail.com
    • http://nightsailer.com/
MongoDB

  NoSQL      ?
Auto-shard           ?

• No! 08
                           K/V
    MySQL
[Diagram: App; MMM; Mysql(Master-Master): Mysql(M1), Mysql(M2); Slave, Slave]

• MySQL (Percona), Master-Master-Slaves
• HA: MMM
•
•       schema

    •
•
•
MySQL
*                JSON

    • schema
    •
*

    • schema
    •          query
[Diagram: App; MMM/vdb11; KV: KV1, KV2; Mysql(M1), Mysql(M2); Slave, Slave]
Memcached

•
     Memcached

•                KV
•   PHP/Perl

•   Memcached

•
•
•
• Flare
• Repcached
• Redis
• TC/TT
Flare

•      cluster     ,


•      Memcached
1

•
•
    •
    •
    •       ;-(
• Cassandra
 •
• CouchDB
 •
MongoDB
•                Redis

• Document
•            Redis

• MySQL              ,

• MySQL
MySQL   MongoDB

•          MySQL     MongoDB


•        MySQL <=> MongoDB


•
MySQL

• Transaction
• Joins
•
•
•1   1            90%

•     35 tables => 10 collections

•            happy!
MongoDB,

           GridFS

•
•
    •   MogileFS
MongoDB,

• SourceForge
•
• 10gen         mailing-list


• NoSQL                MongoDB
1
•
    comments: {
      _id: ObjectId('xxx'),
      art_id: 2,
      content: '...',
      replied_on: 12233,
      created_on: 12222,
      replies: [{
        _id: ObjectId('xxx'),
        content: '...',
        replies: []
      }]
    }
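
The embedded-replies layout above can be sketched in plain Python dictionaries, without a MongoDB driver. This is a minimal illustration only: the field names come from the slide, while `make_comment`, `make_reply`, and `add_reply` are hypothetical helpers, and setting `replied_on` equal to `created_on` for a new comment is an assumption.

```python
def make_comment(comment_id, content, art_id, created_on):
    """Top-level comment with the fields shown on the slide."""
    return {
        "_id": comment_id,
        "art_id": art_id,
        "content": content,
        "replied_on": created_on,  # assumption: starts equal to created_on
        "created_on": created_on,
        "replies": [],
    }


def make_reply(reply_id, content):
    """A reply has the same recursive shape: its own nested replies list."""
    return {"_id": reply_id, "content": content, "replies": []}


def add_reply(root, parent_id, reply, now):
    """Attach reply under the node whose _id is parent_id.

    Walks the embedded tree iteratively; returns True if the parent was found.
    """
    stack = [root]
    while stack:
        node = stack.pop()
        if node["_id"] == parent_id:
            node["replies"].append(reply)
            root["replied_on"] = now  # bump the thread's last-reply time
            return True
        stack.extend(node["replies"])
    return False


comment = make_comment("c1", "first comment", art_id=2, created_on=12222)
add_reply(comment, "c1", make_reply("r1", "a reply"), now=12233)
add_reply(comment, "r1", make_reply("r2", "a nested reply"), now=12240)
```

Keeping the whole thread in one document means a single read returns the comment and all of its replies, at the cost of rewriting that document on every new reply.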
2
           /       :

•
•
•
               /       /
    ...)

•
2


• db.activity_stream.feed (
• db.activity_stream.user         )

• db.activity_stream(
2

feed

•
    •                follower
        collection
    •                  embed list
3:
    1: Regex

•
    2: Sphinx

•              xml

•
3:
        3:        Array/List      ($all)

•
•            PHP-SCWS

•             :

    •                          oplog
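
The array-plus-`$all` approach can be sketched as follows. The example is a rough in-memory emulation rather than the deck's actual code: `tokenize` is a whitespace splitter standing in for a real segmenter such as SCWS, the `keywords` field name is an assumption, and `matches` only imitates how a `$all` filter behaves on an array field.

```python
def tokenize(text):
    """Stand-in for a real word segmenter: lowercase, split on whitespace."""
    return sorted(set(text.lower().split()))


def build_query(search_text, field="keywords"):
    """Build a MongoDB-style filter requiring every search term to be present."""
    return {field: {"$all": tokenize(search_text)}}


def matches(doc, query):
    """Emulate $all on plain dicts: each listed term must be in the array."""
    return all(
        set(cond["$all"]) <= set(doc.get(field, []))
        for field, cond in query.items()
    )


docs = [
    {"_id": 1, "title": "MongoDB GridFS notes"},
    {"_id": 2, "title": "MySQL replication notes"},
]
# Store the tokens next to the document so they can be indexed and queried.
for d in docs:
    d["keywords"] = tokenize(d["title"])

query = build_query("notes mongodb")
hits = [d["_id"] for d in docs if matches(d, query)]
```

With a real deployment the `query` dictionary would be passed to the driver's find call, and an index on the keyword array would keep the lookup from scanning every document.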
MongoDB

• Scons,         Python

• boost(CentOS)
• static link mongod
• tcmalloc
•                MongoDB( ICC
COMMON_CXXFLAGS='-fp-model source -unroll2 -axSSE4.1,SSE4.2 -xSSE3 -static-intel -fpic -fno-strict-aliasing'

CXXFLAGS="-O3 -ipo -static-libgcc $COMMON_CXXFLAGS"

# Clean any previous build output (-c)
scons --release --static --extrapath=/opt/local --cxx=$CXX --icc \
    --extralib='tcmalloc_minimal' --icc-cxxflags="$CXXFLAGS" \
    --icc-cppflags="$CPPFLAGS" -c $BIN_SERVER

# Build with 4 parallel jobs
scons -j4 --release --static --extrapath=/opt/local --cxx=$CXX --icc \
    --extralib='tcmalloc_minimal' --icc-cxxflags="$CXXFLAGS" \
    --icc-cppflags="$CPPFLAGS" $BIN_SERVER
•
•
•            Raid10

•
    • XFS
    • Ext4 (?)
2009/6,       0.9/1.0

[Diagram: Nginx; PHP-FPM; Mongod (Master) with Gearman-workers; Mongod (Slave) with Gearman-workers]

• 1 Master + 2 slaves
• 1m
• 20g
• Dell 2850/4g(Master)   2*Dell 2950/4g(
*          Slave

•
* lvm snapshot
•
• fsync & lock db
* mongodump

•
• mongostat / vmstat / iostat
• collectd
  •        : json-rest+perl plugin

•         Munin / Nagios ...
MongoDB

                              • CPU
                              •                                        4G

                              •                                   IO


:http://czone.chinavisual.com/art/4c7918b74979590970b80000.html
6




:http://czone.chinavisual.com/art/4b45015496ddabef73f49197.html
Why?
•                   Map/Reduce       /


•
•                                  1.6+

     •
**       reIndex repairDatabase,
0:

* MongoDB

• Out of Memory!
•        v1.3
*       cursor

• Perl driver    bug,
0:
    1:

•
•                          2
    2:

•
              driver

•                 driver
1:50x
502 Bad Gateway

[Diagram: Nginx; Starman/Plack; MongoDB; Proxy store; Disk Cache]

* GridFS
    • Perl Plack
* GridFS
    • Nginx proxy_store
    •
1:50x
504 Gateway timeout
    •
    •
    • Perl            nginx_errorlog
             5          mongodb.log

•                                 client
1:50x
    3

•               XFS                      pre-allocation FS

•
        for i in {1..50}
        do
            echo $i
            head -c 2146435072 /dev/zero > $db.$i
        done
2:

• Mongod crash
•                Why?

  •               Map/Reduce   /


 •
: http://czone.chinavisual.com/art/4cad0e08497959d621a10000.html
•
                      • 10
                      • MongoDB
                       • Repair                                    5

                       •
                       •       10
: http://czone.chinavisual.com/art/4cad0e08497959d621a10000.html
• MongoDB
 •
 • cluster   :   slave

 •
1

•                      --syncdelay

    • 60s(default) => 15~30s
    • IO
•                          fsync

    •
2

•          1.6.3,     Master-Slaves ReplicaSets

    • 1 Primary + 2 Secondary
•                   4g-8g

•                     w=2

    •
5

•
•
•
•
:http://czone.chinavisual.com/art/4bd19d3b4979593e1a350000.html
4: RS fail-over


• MongoDB
 • Primary             kill   2
     secondary
 •         secondary
4: RS fail-over

•          2   Arbiter
    •   $ mongod --bind_ip 127.0.0.1,192.168.8.10 --replSet rs10 --oplogSize 1 ...

    • > rs.addArb('192.168.8.10:27020')
    • ...
4: RS fail-over

    6:             ReplicaSet

• 1 Primary + 2 Secondary + n Arbiter
•
•
•6          “ ”

  •                     ;-(

  •
•          ReplicaSet   fail-over
  secondary

• Primary          ,

 •
1.8

• 1.8
  • journaling file
  • mongod --dur
  • crash            repairDatabase

•
• GridFS
 •
           10mb-500mb

 •
• GridFS
  •           Nginx proxy_store

• MongoDB
•               ,Plack app

  • prefork
• GridFS
  •
•           slaveOK

• Plack app       Twiggy   AnyEvent
[Diagram: Nginx; Starman workers/PSGI (Starman); Gearman; RS02; Twiggy/PSGI on 127.0.0.1:9001, 127.0.0.1:9002, 127.0.0.1:9003, 127.0.0.1:9004; RS01 (slaveOK)]
Secondary
•
•
•
[Diagram: IDC with ReplicaSet1 (Primary, Secondary1, Secondary2); VPN; IDC with Slave1; MongoD]

• Slave only
• priority       0
• VPN            2
•               >300ms

    • MongoDB
•       MongoDB

    •   snapshot

•
    •     local.oplog.rs (   tailable cursor)

    •
    •               replay oplog
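
The tail-and-replay idea can be sketched against an in-memory store. This is a simplified illustration, not the deck's tooling: the entry fields (`ts`, `op`, `ns`, `o`, `o2`) follow the oplog document shape, but `ts` is modeled as a unique integer instead of a BSON timestamp, only inserts, `$set` or replacement updates, and deletes are handled, and `apply_op` and `replay` are hypothetical names.

```python
def apply_op(store, entry):
    """Apply one oplog-style entry to an in-memory {ns: {_id: doc}} store."""
    coll = store.setdefault(entry["ns"], {})
    op = entry["op"]
    if op == "i":  # insert: "o" is the full document
        coll[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "u":  # update: "o2" selects the document, "o" modifies it
        _id = entry["o2"]["_id"]
        change = entry["o"]
        if "$set" in change:
            coll.setdefault(_id, {"_id": _id}).update(change["$set"])
        else:  # whole-document replacement
            coll[_id] = dict(change, _id=_id)
    elif op == "d":  # delete: "o" carries the _id
        coll.pop(entry["o"]["_id"], None)
    # Other op types (no-ops, commands) are ignored in this sketch.


def replay(store, entries, after_ts=0):
    """Replay entries newer than after_ts in ts order; return the last ts applied."""
    last = after_ts
    for entry in sorted(entries, key=lambda e: e["ts"]):
        if entry["ts"] > last:
            apply_op(store, entry)
            last = entry["ts"]
    return last


entries = [
    {"ts": 1, "op": "i", "ns": "db.c", "o": {"_id": 1, "n": "a"}},
    {"ts": 2, "op": "u", "ns": "db.c", "o2": {"_id": 1}, "o": {"$set": {"n": "b"}}},
    {"ts": 3, "op": "i", "ns": "db.c", "o": {"_id": 2, "n": "x"}},
    {"ts": 4, "op": "d", "ns": "db.c", "o": {"_id": 2}},
]
store = {}
checkpoint = replay(store, entries)
```

Persisting the returned checkpoint lets a consumer resume after the last applied entry instead of replaying from the start.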
•        2

• GridFS
•     VPN
•              VPN

•     GridFS

•
    • BSON             HTTP


•                    mongod
7

                                                                   •
                                                                       •
                                                                       •
                                                                   •           unix




: http://czone.chinavisual.com/art/4c05e7be4979596b7e570000.html
Auto-sharding

• 1.6GA   Auto-sharding

 •
•2        1

 •
 •
• Shard_key
 • shard key        chunk

   • 4sq
 • shard_key
• counting
 •          chunk

• balancer
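
How a shard key maps to chunks, and why the choice of key matters for balance, can be sketched with a small range lookup. This is an illustration under simplifying assumptions: `ChunkMap` and `load_per_shard` are hypothetical names, chunks are assigned to shards round-robin, and no splitting or balancer migration is modeled.

```python
import bisect


class ChunkMap:
    """Range-partitioned chunks over a sorted list of split points.

    Chunk 0 covers keys below the first split point; chunk i covers keys
    from split point i-1 (inclusive) up to split point i (exclusive).
    """

    def __init__(self, split_points, shards):
        self.split_points = sorted(split_points)
        # One more chunk than split points; assign owners round-robin.
        self.owners = [
            shards[i % len(shards)] for i in range(len(self.split_points) + 1)
        ]

    def chunk_for(self, key):
        return bisect.bisect_right(self.split_points, key)

    def shard_for(self, key):
        return self.owners[self.chunk_for(key)]


def load_per_shard(chunk_map, keys):
    """Count how many of the given keys land on each shard."""
    counts = {}
    for key in keys:
        shard = chunk_map.shard_for(key)
        counts[shard] = counts.get(shard, 0) + 1
    return counts


chunks = ChunkMap(split_points=[100, 200], shards=["shard1", "shard2"])

# A monotonically increasing key sends every new write to the last chunk.
hot = load_per_shard(chunks, range(200, 300))

# Keys spread across the range are distributed over both shards.
spread = load_per_shard(chunks, [5, 150, 250, 120])
```

In the first case all 100 writes hit one shard until the chunk is split and moved, which is the kind of hot spot a poorly chosen shard key produces.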
•
    •         shard


    •             shard

•
•       1.8
•
• MongoDB    auto-shard

• GA
  • 2.0+ ?
MongoDB
• MySQL Web
• Schema free
• Geo
• MySQL
• GridFS
•                            sharding

• Auto-sharding shard key   balancing
             1.8/2.0
Question?




      :http://czone.chinavisual.com/art/4b45015461be3def730e6351.html

MongoDB Development and Application in Practice
