Look Ma! No more blobs

GridFS is a storage mechanism for persisting large binary data in MongoDB.

Published in: Technology, News & Politics

Transcript

  • 1. Look Ma! No more blobs. Aparna Chaudhary. NoSQL matters @ Cologne, Germany, 2013
  • 2. EMBRACE POLYGLOT PERSISTENCE! STOP RDBMS ABUSE! KNOW YOUR USE CASE
  • 3. Use Case: Parse / Extract / Store / Read XML. "We don't do rocket science..." RA: Runtime support for document types. Metadata definition provided at runtime. Document type names: max 50 char. Look up content based on metadata.
  • 4. Challenges ...and details. "But… the numbers make it interesting..." RA: Storage of up to one million documents of 10KB to 2GB per document type per year. Write 1MB < x msec. Retrieve 1MB < y msec.
  • 5. How? File System, MongoDB, RDBMS, JCR, Document Management
  • 6. File System: "If you want to store files, it's logical to use the file system. Ain't it?" ✓ Ease of use ✓ No special skill-set ✓ Backup and recovery ✓ It's free!
  • 7. How do I name them? Support for metadata storage? Performance with too many small files? Query - Administration? High availability? Limitation on total number of files?
  • 8. RDBMS: Relational database. Integrity, Consistency, Durability, Atomicity, Joins, Aggregations, Backups, High Availability. "You name it, we have it!"
  • 9. RDBMS: Developer's Perspective
  • 10. Challenge #1. RA: We need runtime support for document type.
  • 11. Challenge #1. Dynamic DDL generation: DOC_1, DOC_2, DOC_3, DOC_4, DOC_5, DOC_6
  • 12. Challenge #1. DEV: String concatenations are ugly…
  • 13. Challenge #1. DEV: Let's build a utility.
  • 14. Challenge #1. More work
  • 15. Challenge #2. RA: Document type is 50 char long.
  • 16. Challenge #2. TABLE NAME LIMITS. Wait… SQL-92 says 128 char? "We rule. Let's support only 30 char."
  • 17. Challenge #2. DEV: Let's create a mapping table: DOC_TYPE_MAPPING
  • 18. Challenge #2. Ugly, unreadable table names!
  • 19. So... finally... Simple use case becomes complex... Flow: Read XML; Document Type Defined? (Yes / No); Dynamic DDL generation; Document Type Alias; Extract Metadata; Store Metadata; Store Content.
  • 20. Remember... Our Challenge. DEV: Aah.. what about performance now? QA: Let's see if we are in spec for response time.
  • 21. MongoDB: Document Based, GridFS, B-Tree, Dynamic Schema, JSON, BSON, Query, Scalable. http://www.10gen.com/presentations/storage-engine-internals Joins, Complex Transaction
  • 22. Concepts: Table = Collection, Column = Field, Row = Document, Database = Database. [Diagram: a table with columns F1 to F5 and rows ID1 to ID5, next to documents with varying fields (F1 … Fx); databases each containing several collections.]
  • 23. GridFS: MongoDB divides the large content into chunks. Stores metadata and chunks separately. http://docs.mongodb.org/manual/core/gridfs/
  • 24. > mybucket.files
       { "_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
         "chunkSize" : NumberLong(262144),
         "length" : NumberLong(103015),
         "md5" : "34d29a163276accc7304bd69c5520e55",
         "filename" : "health_record_2.xml",
         "contentType" : "application/xml",
         "uploadDate" : ISODate("2013-03-23T07:41:44.907Z"),
         "aliases" : null,
         "metadata" : { "fname" : "Aparna", "lname" : "Chaudhary", "country" : "Netherlands" } }
       ObjectId - 12 Byte BSON: 4 Byte - Seconds since Epoch, 3 Byte - Machine Id, 2 Byte - Process Id, 3 Byte - Counter
  • 25. > mybucket.chunks
       { "_id" : ObjectId("514d5cb8c2e6ea4329646a5d"),
         "files_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
         "n" : 0,
         "data" : BinData(0,...) }
  • 26. "I'm storing a 10KB file, but would it use 256KB on disk?" Chunk is as big as it needs to be... Last chunk = FileSize % 256 + metadata overhead. 1128KB file: 256 + 256 + 256 + 256 + 104 + x. 10KB file: 10 + x.
  • 27. Challenge #1. RA: We need runtime support for document type. DEV: MongoDB supports Dynamic Schema. You can use a collection per docType and they are created dynamically.
  • 28. Challenge #2. RA: Document type is 50 char long. DEV: MongoDB namespace can be up to 123 char.
  • 29. So... finally... Simple use case remains simple... well, becomes simpler... Flow: Read XML; Extract Metadata; Store Metadata & Content.
  • 30. Remember... Our Challenge. QA: Let's see if we are in spec for response time. DEV: Performance test is part of our definition of DONE.
  • 31. Because seeing is believing! Demo: ‣ GridFS 2.4.0 ‣ PostgreSQL 9.2 ‣ Spring Data ‣ JMeter 2.7 ‣ Mac OS X 10.8.3, 2.3GHz Quad-Core Intel Core i7, 16GB RAM. https://github.com/aparnachaudhary/nosql-matters-demo
  • 32. EMBRACE POLYGLOT PERSISTENCE! STOP RDBMS ABUSE! KNOW YOUR USE CASE. @aparnachaudhary
  • 33. Thank You! Java Developer, Data Lover. Eindhoven, Netherlands. http://blog.aparnachaudhary.com/ @aparnachaudhary
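
The chunk arithmetic from slide 26 can be sketched in a few lines of Python. This is only an illustration of the "last chunk = file size % chunk size" rule, not GridFS driver code: the function name `chunk_sizes` is hypothetical, "KB" is assumed to mean 1024 bytes (consistent with the `chunkSize` of 262144 on slide 24), and the per-chunk metadata overhead (the "x" on the slide) is left out because the slides do not quantify it.

```python
def chunk_sizes(file_size: int, chunk_size: int = 256 * 1024) -> list[int]:
    """Return the payload size in bytes of each chunk for a file of file_size bytes.

    Hypothetical helper: it mirrors the arithmetic on slide 26 and ignores
    the per-chunk metadata overhead, which the slides leave unspecified.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    full_chunks, last_chunk = divmod(file_size, chunk_size)
    sizes = [chunk_size] * full_chunks
    if last_chunk:
        # The final chunk holds only the remaining bytes; it is not padded.
        sizes.append(last_chunk)
    return sizes


if __name__ == "__main__":
    # The three file sizes that appear on slides 24 and 26.
    for label, size in [("10KB", 10 * 1024), ("1128KB", 1128 * 1024), ("slide 24", 103015)]:
        print(label, [s / 1024 for s in chunk_sizes(size)])
```

With the 256KB chunk size from the slides, the 10KB file yields a single 10KB chunk, and the 1128KB file yields four 256KB chunks plus a 104KB tail, which matches the breakdown on slide 26. The chunk size is a parameter here because the default is a property of the MongoDB version and driver in use.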
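
The ObjectId layout listed on slide 24 (4 bytes of seconds since the epoch, 3 bytes machine id, 2 bytes process id, 3 bytes counter) can be checked against the ids shown on slides 24 and 25. The sketch below assumes that layout exactly as the slide states it and uses only the Python standard library; `decode_object_id` is a hypothetical helper name, not a driver API.

```python
from datetime import datetime, timezone


def decode_object_id(hex_id: str) -> dict:
    """Split a 24-character hex ObjectId into the four fields named on slide 24.

    Hypothetical helper; assumes the 4/3/2/3 byte layout given on the slide.
    """
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    return {
        # Bytes 0-3: big-endian seconds since the Unix epoch.
        "timestamp": datetime.fromtimestamp(int.from_bytes(raw[0:4], "big"), tz=timezone.utc),
        # Bytes 4-6: machine id, kept as hex.
        "machine_id": raw[4:7].hex(),
        # Bytes 7-8: process id.
        "process_id": int.from_bytes(raw[7:9], "big"),
        # Bytes 9-11: counter.
        "counter": int.from_bytes(raw[9:12], "big"),
    }


if __name__ == "__main__":
    file_id = decode_object_id("514d5cb8c2e6ea4329646a5c")   # _id from mybucket.files
    chunk_id = decode_object_id("514d5cb8c2e6ea4329646a5d")  # _id from mybucket.chunks
    print(file_id["timestamp"].isoformat())  # 2013-03-23T07:41:44+00:00
    print(chunk_id["counter"] - file_id["counter"])  # 1
```

The decoded timestamp agrees to the second with the `uploadDate` of 2013-03-23T07:41:44.907Z on slide 24, and the chunk's id differs from the file's id only in the counter, which is one higher.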