Look Ma! No more blobs. Aparna Chaudhary, NoSQL matters, Cologne, Germany, 2013
GridFS is a storage mechanism for persisting large binary data in MongoDB.

  1. Look Ma! No more blobs. Aparna Chaudhary, NoSQL matters, Cologne, Germany, 2013.
  2. Embrace polyglot persistence! Stop RDBMS abuse! Know your use case.
  3. Use case: read XML, parse, extract, store. "We don't do rocket science..." RA: runtime support for document types; metadata definition provided at runtime; document type names of max 50 char; look up content based on metadata.
  4. Challenges. RA: storage of up to one million documents of 10KB to 2GB per document type per year; write 1MB < x msec; retrieve 1MB < y msec; ...and details. But... the numbers make it interesting...
  5. How? File system, RDBMS, MongoDB, JCR, document management.
  6. File system. "If you want to store files, it's logical to use a file system. Ain't it?" ✓ Ease of use ✓ No special skill set ✓ Backup and recovery ✓ It's free!
  7. How do I name them? Support for metadata storage? Performance with too many small files? Query and administration? High availability? Limitation on total number of files?
  8. RDBMS (relational database): integrity, consistency, durability, atomicity, joins, aggregations, backups, high availability. "You name it, we have it!"
  9. RDBMS: developer's perspective.
  10. Challenge #1. RA: We need runtime support for document type.
  11. Challenge #1. Dynamic DDL generation: DOC_1, DOC_2, DOC_3, DOC_4, DOC_5, DOC_6.
  12. Challenge #1. DEV: String concatenations are ugly...
  13. Challenge #1. DEV: Let's build a utility.
  14. Challenge #1. More work.
  15. Challenge #2. RA: Document type is 50 char long.
  16. Challenge #2. Table name limits. "Wait... SQL-92 says 128 char?" "We rule. Let's support only 30 char."
  17. Challenge #2. DEV: Let's create a mapping table (DOC_TYPE_MAPPING).
  18. Challenge #2. Ugly, unreadable table names!
  19. So... finally... Flow: read XML; check whether the document type is defined (yes/no); dynamic DDL generation and document type alias; extract metadata; store metadata; store content. Simple use case becomes complex...
  20. Remember... our challenge. QA: Let's see if we are in spec for response time. DEV: Aah... what about performance now?
  21. MongoDB: document based, dynamic schema, JSON, BSON, B-tree, query, scalable, GridFS. Listed separately: joins, complex transaction. http://www.10gen.com/presentations/storage-engine-internals
  22. Concepts. A database holds collections; the documents in a collection (ID1 to ID5) can carry different fields (F1 to F9, Fx). Table = collection, column = field, row = document, database = database.
  23. GridFS. MongoDB divides the large content into chunks and stores metadata and chunks separately. http://docs.mongodb.org/manual/core/gridfs/
  24. > mybucket.files
          { "_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
            "chunkSize" : NumberLong(262144),
            "length" : NumberLong(103015),
            "md5" : "34d29a163276accc7304bd69c5520e55",
            "filename" : "health_record_2.xml",
            "contentType" : "application/xml",
            "uploadDate" : ISODate("2013-03-23T07:41:44.907Z"),
            "aliases" : null,
            "metadata" : { "fname" : "Aparna", "lname" : "Chaudhary", "country" : "Netherlands" } }
      ObjectId is a 12-byte BSON value: 4 bytes for seconds since epoch, 3 bytes for machine id, 2 bytes for process id, 3 bytes for a counter.
  25. > mybucket.chunks
          { "_id" : ObjectId("514d5cb8c2e6ea4329646a5d"),
            "files_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
            "n" : 0,
            "data" : BinData(0,...) }
  26. "I'm storing a 10KB file, but would it use 256KB on disk?" Last chunk = FileSize % 256 + metadata overhead. A 1128KB file is stored as chunks of 256, 256, 256, 256 and 104 + x KB; a 10KB file is stored as one chunk of 10 + x KB. A chunk is as big as it needs to be...
  27. Challenge #1. RA: We need runtime support for document type. DEV: MongoDB supports dynamic schema. You can use a collection per docType, and they are created dynamically.
  28. Challenge #2. RA: Document type is 50 char long. DEV: MongoDB namespace can be up to 123 char.
  29. So... finally... Flow: read XML; extract metadata; store metadata and content. Simple use case remains simple... well, becomes simpler...
  30. Remember... our challenge. QA: Let's see if we are in spec for response time. DEV: Performance test is part of our definition of DONE.
  31. Because seeing is believing! Demo: GridFS 2.4.0, PostgreSQL 9.2, Spring Data, JMeter 2.7, Mac OS X 10.8.3 on a 2.3GHz quad-core Intel Core i7 with 16GB RAM. https://github.com/aparnachaudhary/nosql-matters-demo
  32. Embrace polyglot persistence! Stop RDBMS abuse! Know your use case. @aparnachaudhary
  33. Thank you! Aparna Chaudhary: Java developer, data lover. Eindhoven, Netherlands. http://blog.aparnachaudhary.com/ @aparnachaudhary
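Slides 10 to 19 describe the relational workaround only in outline: a short alias table per document type, a DOC_TYPE_MAPPING table, and DDL built by string concatenation. The following is a minimal sketch of that idea, not the speaker's implementation. It assumes SQLite via Python's standard library purely so it runs anywhere; the function name `table_for`, the column names, and the counter-based alias scheme are hypothetical.

```python
import sqlite3

MAX_TABLE_NAME = 30  # table name limit quoted on slide 16


def table_for(conn, doc_type):
    """Return the alias table for doc_type, creating it on first use."""
    # Mapping table from slide 17; its columns are an assumption.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS DOC_TYPE_MAPPING ("
        "alias TEXT PRIMARY KEY, doc_type TEXT UNIQUE NOT NULL)"
    )
    row = conn.execute(
        "SELECT alias FROM DOC_TYPE_MAPPING WHERE doc_type = ?", (doc_type,)
    ).fetchone()
    if row:
        return row[0]

    # Hypothetical alias scheme: DOC_1, DOC_2, ... as drawn on slide 11.
    count = conn.execute("SELECT COUNT(*) FROM DOC_TYPE_MAPPING").fetchone()[0]
    alias = "DOC_%d" % (count + 1)
    if len(alias) > MAX_TABLE_NAME:
        raise ValueError("alias exceeds table name limit")

    conn.execute(
        "INSERT INTO DOC_TYPE_MAPPING (alias, doc_type) VALUES (?, ?)",
        (alias, doc_type),
    )
    # Table names cannot be bound as parameters, so the DDL is concatenated.
    # This is the "ugly" part slide 12 complains about.
    conn.execute(
        "CREATE TABLE " + alias
        + " (id INTEGER PRIMARY KEY, metadata TEXT, content BLOB)"
    )
    return alias


conn = sqlite3.connect(":memory:")
long_type = "patient_health_record_with_a_fifty_character_name"
print(table_for(conn, long_type))
```

Every read or write then has to go through the mapping lookup first, which is the extra moving part slide 19 calls out.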
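The ObjectId layout on slide 24 (4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter) can be checked against the ids shown on slides 24 and 25. The sketch below uses only the Python standard library; the function name `decode_object_id` is hypothetical, and reading the process id as big-endian is an assumption, since the slide does not state a byte order for it.

```python
from datetime import datetime, timezone


def decode_object_id(hex_id):
    """Split a 24-character ObjectId hex string into the slide's four fields."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    seconds = int.from_bytes(raw[0:4], "big")
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),
        # Byte order of the process id is assumed here.
        "process": int.from_bytes(raw[7:9], "big"),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


file_id = decode_object_id("514d5cb8c2e6ea4329646a5c")   # slide 24
chunk_id = decode_object_id("514d5cb8c2e6ea4329646a5d")  # slide 25
print(file_id["timestamp"].isoformat())
print(chunk_id["counter"] - file_id["counter"])
```

The decoded timestamp falls in the same second as the `uploadDate` on slide 24, and the chunk's counter is one higher than the file's, which is consistent with the two ids being generated back to back.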
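The chunk arithmetic on slide 26 can be worked through with a few lines of Python. This is a sketch of the arithmetic only: the function name `chunk_lengths` is hypothetical, and the per-chunk metadata overhead (the "x" on the slide) is left out because the deck gives no figure for it.

```python
def chunk_lengths(file_size, chunk_size=256):
    """Return the payload size of each chunk for a file of file_size.

    Units are whatever the caller uses consistently (KB in the slide's
    example, bytes in the mybucket.files document).
    """
    full_chunks, remainder = divmod(file_size, chunk_size)
    lengths = [chunk_size] * full_chunks
    if remainder:
        # The last chunk only holds what is left, not a full chunk_size.
        lengths.append(remainder)
    return lengths


print(chunk_lengths(1128))           # slide 26: 1128KB file
print(chunk_lengths(10))             # slide 26: 10KB file
print(chunk_lengths(103015, 262144)) # slide 24: length and chunkSize in bytes
```

For the document on slide 24, the 103015-byte file fits in a single chunk, which matches the one chunk with `"n" : 0` shown on slide 25.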