This presentation about Azure Data Factory will help you understand what Data Factory is, why we need it, and what a Data Lake is, and it ends with a demo of Azure Data Factory. Azure Data Factory is one of the most important services offered by Azure. The data generated by digital products is increasing exponentially, and a lot of data is accumulating from different sources, so storing and analyzing it becomes a big task. That's where Azure Data Factory comes into play. Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. It stores the data with the help of Data Lake storage; the data is then analyzed with the help of pipelines and published in an organized manner. Now, let us get started and understand Azure Data Factory in detail.
The following topics are covered in this Azure Data Factory presentation:
1. Why Data Factory?
2. What is Azure Data Factory?
3. What is a Data Lake?
4. Demo
About Simplilearn's Microsoft Azure course:
Simplilearn's Developing Microsoft® Azure Solutions (70-532) certification training program is designed to give you mastery of the Microsoft Azure enterprise-grade cloud platform. Through demos and practical applications, you’ll learn to design, develop, implement, automate, and monitor resilient and scalable cloud solutions on the Azure platform. The course will enable you to explore the Microsoft Azure development environment and Azure platform features, and to learn the development tools, techniques, and approaches used to build and deploy cloud apps.
What skills will you learn from this Azure certification training course?
By the end of this Azure certification course, you will be able to:
1. Design and implement Web Apps
2. Create and manage virtual machines
3. Design and implement cloud services
4. Design and implement a storage strategy
5. Manage application and network services
Who should take up this Microsoft Azure certification training course?
This course is intended for developers who need a strong understanding of the concepts and practices of cloud app development and deployment. Applicable careers include:
1. .NET Developers
2. Solution Architects / Team Leads
3. DevOps Engineers / Application Engineers / QA Engineers
Learn more at https://www.simplilearn.com/cloud-computing/microsoft-azure-fundamentals-training.
10. What is Data Factory?
11. What is Data Factory?
Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data.
12. What is Data Factory?
Now let’s talk about the flow
13. What is Data Factory?
This is the process that is followed in Data Factory
[Diagram: Input Dataset]
14. What is Data Factory?
Data Factory works with the following components: input dataset, pipeline, output dataset, linked services, gateway, cloud
Data stores: Azure Data Lake, Blob storage, SQL
Input Datasets
• Represent the collection of data within the data stores
15. What is Data Factory?
[Diagram: Input Dataset, Pipeline]
16. What is Data Factory?
Pipelines & Activities
A pipeline consists of a group of activities:
• Data Movement Activity
• Data Transformation Activity (for example U-SQL, stored procedures, Hive)
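The idea of a pipeline as a group of activities can be sketched in plain Python. This is only a conceptual model, not the Azure Data Factory API: the `Activity` and `Pipeline` classes, the `copy_rows` and `keep_valid_amounts` functions, and the sample rows are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Dataset = List[Row]


@dataclass
class Activity:
    """One step in a pipeline: takes a dataset and returns a new dataset."""
    name: str
    kind: str  # "movement" or "transformation"
    run: Callable[[Dataset], Dataset]


@dataclass
class Pipeline:
    """A named group of activities that run in order."""
    name: str
    activities: List[Activity] = field(default_factory=list)

    def execute(self, input_dataset: Dataset) -> Dataset:
        data = input_dataset
        for activity in self.activities:
            data = activity.run(data)
        return data


def copy_rows(rows: Dataset) -> Dataset:
    # Data movement: copy rows unchanged from source to destination.
    return [dict(row) for row in rows]


def keep_valid_amounts(rows: Dataset) -> Dataset:
    # Data transformation: drop rows with a missing or negative amount.
    return [
        r for r in rows
        if isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ]


# Hypothetical pipeline: one movement activity, then one transformation activity.
pipeline = Pipeline(
    name="clean_sales",
    activities=[
        Activity("copy_from_store", "movement", copy_rows),
        Activity("filter_invalid", "transformation", keep_valid_amounts),
    ],
)

input_dataset = [
    {"id": 1, "amount": 25.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -4.0},
]
output_dataset = pipeline.execute(input_dataset)
print(output_dataset)  # [{'id': 1, 'amount': 25.0}]
```

In this sketch the input dataset goes in raw and the output dataset comes out in a cleaned, structured form, which mirrors the input dataset, pipeline, output dataset flow shown on the slides.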
17. What is Data Factory?
[Diagram: Input Dataset, Pipeline, Output Dataset]
18. What is Data Factory?
Output Datasets
• The data is available in a structured form
19. What is Data Factory?
[Diagram: Input Dataset, Pipeline, Output Dataset, Linked Services; data stores: Azure Data Lake, Blob storage, SQL]
20. What is Data Factory?
Linked Services
• Contain the information needed to connect to external sources
• This is very similar to the concept of a connection string in SQL Server, where you specify the source and destination of your data
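To make the connection-string comparison concrete, here is a minimal Python sketch that splits a `Key=Value;Key=Value` string into its settings. The server names, database names, and the `parse_connection_string` helper are hypothetical and for illustration only; this is not how Data Factory itself stores or reads linked services, and the simple parser does not handle quoted values that contain semicolons.

```python
from typing import Dict


def parse_connection_string(conn: str) -> Dict[str, str]:
    """Split a 'Key=Value;Key=Value' connection string into a dictionary."""
    settings: Dict[str, str] = {}
    for part in conn.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in segment: {part!r}")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical values, for illustration only.
source = parse_connection_string(
    "Server=onprem-sql.example.local;Database=Sales;Integrated Security=true;"
)
destination = parse_connection_string(
    "Server=example.database.windows.net;Database=SalesDW;"
)

print(source["Server"], "->", destination["Server"])
```

Like a linked service, each parsed dictionary holds just the information needed to reach one external store: where it lives and which database to open.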
21. What is Data Factory?
[Diagram: Input Dataset, Pipeline, Output Dataset, Linked Services, Gateway; data stores: Azure Data Lake, Blob storage, SQL]
22. What is Data Factory?
• Connects your on-premises data to the cloud
• It consists of a client agent which is installed on the on-premises
data system, which then connects to the Azure Data
Gateway
23. What is Data Factory?
[Diagram: Input Dataset, Pipeline, Output Dataset, Linked Services, Gateway, Cloud; data stores: Azure Data Lake, Blob storage, SQL]
24. What is Data Factory?
Cloud
• The data is analyzed and visualized using the analytical frameworks
26. What is Data Lake?
A highly scalable, distributed, parallel file system in the cloud, specifically designed to work with multiple analytics frameworks.
27. What is Data Lake?
[Diagram: Mobile, Video, Web, Social, Output Dataset]
28. What is Data Lake?
[Diagram: Mobile, Video, Web, Social, Azure Data Lake Store, Output Dataset]
29. What is Data Lake?
[Diagram: Mobile, Video, Web, Social, Azure Data Lake Store, Output Dataset, External Frameworks]
30. What is Data Lake?
Data Lake works on two main concepts, storage and analytics
31. What is Data Lake?
Storage
• Unlimited storage (GB, TB, PB)
32. What is Data Lake?
Storage
• Store a variety of data
33. What is Data Lake?
Storage
• Can store large files
34. What is Data Lake?
Analytics
• Monitor and diagnose real-time data from connected devices such as vehicles, buildings, or machinery in order to generate alerts, respond to events, or optimize operations
35. What is Data Lake?
Analytics
• Monitor financial transactions in real time to detect fraudulent activity. Correlate a credit card’s use across geographic locations; monitor the number of transactions on a single credit card
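The two fraud checks mentioned above (counting transactions on a single card, and correlating a card's use across locations) can be sketched in Python as follows. This is a small batch illustration under stated assumptions, not a real-time analytics job and not a Data Lake API: the `Transaction` type, the `find_suspicious_cards` function, the thresholds, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Transaction:
    card_id: str
    timestamp: int  # seconds since some reference point
    location: str


def find_suspicious_cards(
    transactions: List[Transaction],
    window_seconds: int = 3600,
    max_transactions: int = 3,
    max_locations: int = 1,
) -> Set[str]:
    """Flag cards with too many transactions, or too many locations, in one window."""
    by_card: Dict[str, List[Transaction]] = defaultdict(list)
    for tx in transactions:
        by_card[tx.card_id].append(tx)

    suspicious: Set[str] = set()
    for card_id, txs in by_card.items():
        txs.sort(key=lambda t: t.timestamp)
        start = 0
        for end in range(len(txs)):
            # Shrink the window until it spans at most window_seconds.
            while txs[end].timestamp - txs[start].timestamp > window_seconds:
                start += 1
            window = txs[start:end + 1]
            locations = {t.location for t in window}
            if len(window) > max_transactions or len(locations) > max_locations:
                suspicious.add(card_id)
                break
    return suspicious


transactions = [
    Transaction("card-A", 0, "Seattle"),
    Transaction("card-A", 600, "Seattle"),
    Transaction("card-B", 100, "London"),
    Transaction("card-B", 900, "Tokyo"),     # two cities within 15 minutes
    Transaction("card-C", 0, "Paris"),
    Transaction("card-C", 7200, "Berlin"),   # two cities, but two hours apart
]
print(find_suspicious_cards(transactions))  # {'card-B'}
```

Only card-B is flagged here, because it appears in two locations inside the same one-hour window. A production system would apply the same kind of windowed rule to a live stream rather than to a list in memory.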