Montreal SQL Saturday: Moving Data from a NoSQL DB to Azure Data Lake


NoSQL databases have grown in popularity in recent years thanks to their flexible data modeling and scaling capabilities, and they are widely used in the big data landscape. This demo-rich session explains the differences between SQL and NoSQL and walks through an end-to-end solution for moving data out of the NoSQL database MongoDB using Azure Data Factory.


  1. 1. SQL vs NoSQL, and moving data from MongoDB to Azure Data Lake using Azure Data Factory (Diponkar Paul)
  2. 2. Thanks to our sponsors And Global Microsoft RedHat Google Cloud
  3. 3. Title Content • Father and Husband • Blogger & Speaker • Data Engineer • Diverse background • Community • Twitter: @Paulswengrr • Blog: www.allaboutdata.ca
  4. 4. What we cover • Refresh our memory of traditional SQL • Get to know NoSQL (MongoDB) • Demo: NoSQL • Comparison • Azure Data Factory: copy data from MongoDB • Demo: MongoDB with ADF
  5. 5. SQL
  6. 6. SQL Syntax
     SELECT Id, Product, Price FROM Product WHERE ProductCategory = 'Bikes'
     JOIN, INSERT, UPDATE, DELETE
  7. 7. Well-defined schema
     CREATE TABLE [Production].[Product](
       [ProductID] [int] IDENTITY(1,1) NOT NULL,
       [Name] [nvarchar](100) NOT NULL,
       [ProductNumber] [nvarchar](25) NOT NULL,
       [MakeFlag] [dbo].[Flag] NOT NULL,
       [FinishedGoodsFlag] [dbo].[Flag] NOT NULL,
       [Color] [nvarchar](15) NULL,
       [SafetyStockLevel] [smallint] NOT NULL,
       [StandardCost] [money] NOT NULL,
       [ListPrice] [money] NOT NULL,
       [Size] [nvarchar](5) NULL)
  8. 8. Relationship/Normalization
     Product
     Id | Name | Price | Description
     1 | "Mountain Bike" | 2500 | "Bike for mountain trek"
     2 | "City Bike" | 1000 | "Best fit to roam around city"
     Bridge table (Order)
     Id | Customer_ID | Product_ID
     1 | 2 | 1
     2 | 2 | 2
     3 | 1 | 1
     Customer
     Id | Name | Email
     1 | Morten Sorenson | m.s@outlook.com
     2 | Andersen Lu | al@yahoo.com
     3 | Derek Paul | dp@outlook.com
  9. 9. Type of relationship
  10. 10. NoSQL
  11. 11. Vendors in the market • MongoDB • Azure Cosmos DB • Amazon DocumentDB • Oracle NoSQL • Google Bigtable
  12. 12. NoSQL-MongoDB
  13. 13. What do we call them?
      Database: E-Commerce
      Collections (Table): Customer, Product…
      Documents: {"Name": "Anders", age: 36} {"Name": "Carsten", age: 42}
  14. 14. No defined schema [Diagram: documents such as Id: 1, Name: 'Anders', Age: 36 and Id: 2, Name: 'Carsten', Age: 36, drawn with their fields in varying order]
  15. 15. NoSQL: No relation
      Profession
        {id: 1, profession: 'Developer'}
        {id: 2, profession: 'Data Engineer'}
        {id: 3, profession: 'Actor'}
      Users
        {id: 1, name: 'Tom Hanks', age: 20}
        {id: 2, name: 'Casper Ruther', age: 42}
        {id: 3, name: 'Paul Anders', age: 63}
      Usersprofession
        {id: 1, userId: 1, professionId: 1}
        {id: 2, userId: 1, professionId: 2}
        {id: 3, userId: 1, professionId: 3}
        {id: 4, userId: 2, professionId: 2}
      Embedded alternative
        db.Users.insert(
          {
            id: "01",
            name: "Tom Hanks",
            age: 20,
            email: "th@hollywood.com",
            Profession: ["Developer", "Data Engineer", "Actor"]
          }
        )
  16. 16. Tools: MongoDB • Compass: https://www.mongodb.com/products/compass • Robo 3T: https://robomongo.org/ • Data model design: https://docs.mongodb.com/manual/core/data-model-design/ • db.collection.update: https://docs.mongodb.com/manual/reference/method/db.collection.update/
  17. 17. Languages • Mongo shell • Python • Java • C# • Scala • Go and many more.
  18. 18. Demo
  19. 19. SQL vs NoSQL
      SQL | NoSQL
      Data uses a schema | Schema-less (schema agnostic)
      Maintains relationships | No relations, though you can design relationships
      Data distributed across multiple tables | Data in one table (embedded)
      Monolithic; you can easily scale up. Scale-out is also possible but difficult (e.g. Azure Elastic Database tools) | Scale up and scale out; globally distributed
  20. 20. Move your NoSQL data from OnPrem to Data Lake Gen2
  21. 21. Azure Data Lake: a scalable data storage and analytics service • Fully HDFS-compliant file system • Azure AD integrated • Microsoft's PaaS big data solution
  22. 22. Azure Data Factory • ETL/ELT Tool • Code free • Azure Cloud • a lot more…
  23. 23. Prerequisites • Azure account • Azure Data Factory resource • Linked services (source and target connections) • Integration runtime
  24. 24. Demo-ADF
  25. 25. Be cautious! • MongoDB version supported by the ADF copy activity: V 3.4 (see https://docs.microsoft.com/en-us/azure/data-factory/connector-mongodb)
  26. 26. Key Takeaways: • SQL vs. NoSQL • Choose SQL or NoSQL • Integration from on-prem MongoDB • Starting fresh with NoSQL? Choose a cloud solution: Azure Cosmos DB or MongoDB Atlas
  27. 27. Questions @paulswengrr Diponkarpaul
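
The modeling choice on slide 15 (three related collections versus one embedded document) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain in-memory lists stand in for the Users, Profession, and Usersprofession collections, no MongoDB connection is involved, and the helper name `embed_professions` is hypothetical. It folds the bridge rows into one self-contained document per user, which is the shape the `db.Users.insert` example on that slide stores.

```python
# Hypothetical in-memory stand-ins for the three collections on slide 15.
professions = [
    {"id": 1, "profession": "Developer"},
    {"id": 2, "profession": "Data Engineer"},
    {"id": 3, "profession": "Actor"},
]
users = [
    {"id": 1, "name": "Tom Hanks", "age": 20},
    {"id": 2, "name": "Casper Ruther", "age": 42},
    {"id": 3, "name": "Paul Anders", "age": 63},
]
users_professions = [
    {"id": 1, "userId": 1, "professionId": 1},
    {"id": 2, "userId": 1, "professionId": 2},
    {"id": 3, "userId": 1, "professionId": 3},
    {"id": 4, "userId": 2, "professionId": 2},
]


def embed_professions(users, professions, links):
    """Resolve the bridge rows and return one embedded document per user."""
    # Lookup table: profession id -> profession name.
    name_by_id = {p["id"]: p["profession"] for p in professions}
    documents = []
    for user in users:
        # Collect the profession names linked to this user, in bridge order.
        embedded = [
            name_by_id[link["professionId"]]
            for link in links
            if link["userId"] == user["id"]
        ]
        # Build a new dict so the original user rows are left untouched.
        documents.append({**user, "Profession": embedded})
    return documents


documents = embed_professions(users, professions, users_professions)
for doc in documents:
    print(doc)
```

With the embedded shape, reading a user and their professions takes one document lookup instead of a join across three collections; the trade-off is that profession names are repeated in every document that uses them.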
