Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

MongoDB: Intro & Application for Big Data

8,259 views

Published on

  • Be the first to comment

MongoDB: Intro & Application for Big Data

  1. http://blog.nahurst.com/visual-guide-to-nosql-systems
  2. { "_id" : ObjectId("4dcd3ebc9278000000005158"), "timestamp" : ISODate("2011-05-13T14:22:46.777Z"), "binary" : BinData(0,""), "string" : "abc", "number" : 3, "subobj" : {"subA": 1, "subB": { "subsubC": 2 }}, "array" : [1, 2, 3], "dbref" : [_id1, _id2, _id3]}
  3. { "_id" : ObjectId("4dcd3ebc9278000000005158"), "nickname" : "doryokujin"},{ "_id" : ObjectId("4dcd3ebc9278000000005159"), "firstname" : "Takahiro", "lastname" : "Inoue", "mail" : "mr.stoicman@gmail.com", "twitter" : "@doryokujin"},...
  4. { "_id" : ObjectId("4dcd3ebc9278000000005158"), "timestamp" : ISODate("2011-05-13T14:22:46.777Z"), "binary" : BinData(0,""), "string" : "abc", "number" : 3, "subobj" : {"subA": 1, "subB": 2 }, "array" : [1, 2, 3], padding}
  5. { "_id" : ObjectId("4dcd3ebc9278000000005158"), "timestamp" : ISODate("2011-05-13T14:22:46.777Z"), "binary" : BinData(0,""), "string" : "def", "number" : 4, "subobj" : {"subA": 1, "subB": 2 }, "array" : [1, 2, 3, 4, 5, 6], "newkey" : "In-place"}
  6. { "_id" : ObjectId("4dcd3ebc9278000000005158"), "timestamp" : ISODate("2011-05-13T14:22:46.777Z"), "binary" : BinData(0,""), "string" : "abc", "number" : 3, "subobj" : {"subA": 1, "subB": 2 }, "array" : [1, 2, 3],}
  7. Cluster Shard Servers (Data) config Servers (Shard Configration) shard1 shard2 shard3 [ a, f ) [ k, n) [ o, t ) Chunk [ f, k ) [ n, o ) [ t, } ) mongos Servers (Routers)
  8. Shard 1 ( mongosprimary primary ) Shardcinfig Shard mongos
  9. Every Server:Large HDD(500GB )Large Memory(16GB ) Slave Delay in Master Data on Preparation for AmazonS3 User Error Master Data on S3 Non Sharding, Replica Set
  10. From Text Logs: (Large)From mySQL: (Small)From Other NoSQL: (Middle)Temporary Raw data storage is HDFS, not MongoDB We only need result data to discover new features
  11. Reduction First SecondAggregation Aggregation
  12. ScientificPython
  13. Fluent Structured logging Pluggable architecture Reliable forwardinge Event Collector Service
  14. Fluent Structured logging Pluggable architecture Reliable forwarding e Event Collector ServiceSadayuki FuruhashiTreasure Data, Inc.@frsyuki
  15. “2011-04-01 host1 myapp: cmessage size=12MB user=me”2011-04-01 myapp.message { “on_host”: ”host1”, 2011-04-01 myapp.message { ”combined”: true, “on_host”: ”host1”, ”combined”: true, “size”: 12000000, “size”: 12000000, “user”: “me” “user”: “me” }}
  16. log log log log
  17. aggregate aggregate aggregate aggregate log log log log key1 key2 key3 shuffle aggregate aggregate aggregate aggregate

×