191AIE503T CLOUD COMPUTING UNIT - IV
UNIT –IV
AZURE CLOUD AND CORE SERVICES
Azure Synapse Analytics - HDInsight-Azure Data bricks - Usage of Internet of Things (IoT) Hub-IoT
Central-Azure Sphere-Azure Cloud shell and Mobile Apps
Azure Synapse Analytics
Introduction
In mid-2016, Azure made the Azure SQL Data Warehouse service generally available for data
warehousing on the cloud. The service has since gone through several iterations, and toward the end of
2019 Microsoft announced that Azure SQL Data Warehouse would be rebranded as Azure Synapse
Analytics. It is now Azure's de-facto service for combining data warehousing and big data analytics,
with many new features still in preview.
High-Level Architecture
Online Transaction Processing (OLTP) workloads typically involve transactional data with high read and
write volumes. The access pattern usually involves many small scalar and tabular lookups, and data
ingestion generally happens through user transactions in small batches of rows. Online Analytical
Processing (OLAP) applications, by contrast, store and process large volumes of data collected from
various sources; the data may be transformed and/or modeled in the OLAP repository, and large datasets
are then aggregated for ad-hoc reporting and analytical use-cases. The latter is where Synapse
Analytics fits in the overall data landscape: Azure Data Lake Storage forms the bedrock of big data
storage, and Power BI forms the visualization layer.
Azure Synapse Components and Features
There are multiple components of Synapse Analytics architecture on Azure. Let’s understand all these
components one by one.
 Synapse Analytics is an analytics service with virtually unlimited scale to support analytics
workloads
 Synapse Workspaces (in preview as of Sept 2020) provides an integrated console to administer and
operate different components and services of Azure Synapse Analytics
 Synapse Analytics Studio is a web-based IDE to enable code-free or low-code developer experience
to work with Synapse Analytics
 Synapse supports a number of languages like SQL, Python, .NET, Java, Scala, and R that are
typically used by analytic workloads
 Synapse supports two analytics runtimes – SQL-based and Spark-based (the latter in preview as of
Sept 2020) – that can process data in batch, streaming, and interactive modes
 Synapse is integrated with numerous Azure data services as well, for example, Azure Data Catalog,
Azure Data Lake Storage, Azure Databricks, Azure HDInsight, Azure Machine Learning, and Power BI
 Synapse also provides integrated management, security, and monitoring related services to support
monitoring and operations on the data and services supported by Synapse
 Data Lake Storage is suited to big-data-scale volumes modeled in a data lake pattern. This storage
layer acts as the data source layer for Synapse: data is typically populated in Synapse from Data
Lake Storage for various analytical purposes
Now that we understand the different layers and components of the architecture, let's look at the core
pillars of Synapse.
 Azure Synapse Studio – This is a web-based SaaS tool that lets developers work with every aspect
of Synapse Analytics from a single console. In an analytical solution development life-cycle using
Synapse, one generally starts by creating a workspace and launching this tool, which provides
access to different Synapse features: ingesting data using import mechanisms or data pipelines,
creating data flows, exploring data using notebooks, analyzing data with Spark jobs or SQL
scripts, and finally visualizing data for reporting and dashboarding purposes. The tool also
provides features for authoring artifacts, debugging code, optimizing performance by assessing
metrics, integration with CI/CD tools, etc.
 Azure Synapse Data Integration – Different tools can be used to load data into Synapse, but
having an integrated orchestration engine reduces the dependency on, and management of, separate
tool instances and data pipelines. Synapse comes with an integrated orchestration engine,
identical to Azure Data Factory, to create data pipelines and rich data transformations within
the Synapse workspace itself. Key features include support for 90+ data sources: almost 15
Azure-based data sources, 26 open-source and cross-cloud data warehouses and databases, 6
file-based data sources, 3 NoSQL data sources, 28 services and apps that can serve as data
providers, and 4 generic protocols like ODBC and REST. Pipelines can be created from built-in
templates in Synapse Studio to integrate data from various sources.
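The copy-then-transform pattern a Synapse pipeline expresses declaratively can be sketched in plain Python. This is only an illustration of the orchestration idea, not the actual Synapse or Data Factory API; the activity names and sample data are invented for the example.

```python
# Minimal sketch of a two-activity pipeline: a Copy activity lands raw rows,
# then a transformation activity cleans them. Names and data are illustrative.

def copy_activity(source_rows):
    """Stand-in for a Copy activity: land raw rows in a staging area."""
    return list(source_rows)

def transform_activity(rows):
    """Stand-in for a Data Flow: keep valid rows and normalize a column."""
    return [{**r, "country": r["country"].upper()} for r in rows if r["amount"] > 0]

def run_pipeline(source_rows):
    """Run the activities in dependency order, like a pipeline trigger would."""
    staged = copy_activity(source_rows)
    return transform_activity(staged)

sales = [{"country": "us", "amount": 120}, {"country": "in", "amount": -5}]
curated = run_pipeline(sales)
```

A real pipeline adds scheduling, retries, and monitoring on top of this basic dependency ordering.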
 Synapse SQL Pools – This feature provides the same data warehousing capabilities that were
available in earlier versions of the service when it was branded SQL DW. It is available in a
provisioned manner, where a fixed capacity of DWU units is allocated to the service instance for
data processing. Data can be imported into Synapse using different mechanisms like SSIS, PolyBase,
Azure Data Factory, etc. Synapse stores data in a columnar format and enables distributed
querying, which suits the performance profile of OLAP workloads. SQL Pools have built-in support
for data streaming, as well as a few AI functions out of the box.
Generally, Synapse SQL Pools are part of an Azure SQL Server instance and can be browsed using
tools like SSMS as well. The Synapse SQL feature is also available in a serverless manner (in
preview as of Sept 2020), where no fixed infrastructure capacity needs to be provisioned; instead,
Azure manages the capacity required to meet the workload. This is a data virtualization feature
supported by Synapse SQL. The pricing model in this case is based on the volume of data processed
instead of the number of DWUs allocated to the instance.
 Apache Spark for Azure Synapse – This component provides a Spark runtime for data loading, data
processing, data preparation, ETL, and other tasks generally related to data warehousing. Azure
also offers Azure Databricks, a service based on the Spark runtime with a certain set of
optimizations, which is typically used for a similar set of purposes. One advantage of this
feature over Azure Databricks is that no additional or separate clusters need to be managed:
Spark processing is an integral part of Synapse, with auto-scaling and support for features like
.NET for Spark, SparkML algorithms, Delta Lake, Azure ML integration for Apache Spark, and
Jupyter-style notebooks. In addition, it has multi-language support for C#, PySpark, Scala,
Spark SQL, Java, etc. Once a Synapse workspace is created, one can provision Apache Spark pools
or Synapse SQL pools from a common interface.
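The kind of batch aggregation a Spark pool performs at scale follows a simple pattern: each partition is reduced locally, then partial results are merged. The toy version below runs on one machine with plain Python (Spark itself distributes the same steps across a cluster); the partition data is illustrative.

```python
# Toy illustration of Spark-style aggregation: count log levels per
# "partition" locally (map-side combine), then merge the partial counts
# (shuffle/reduce). Runs single-machine; Spark distributes these steps.
from collections import Counter
from functools import reduce

partitions = [["error", "info", "error"], ["info", "warn"], ["error"]]

def local_count(part):
    """Map-side combine: reduce one partition to a partial result."""
    return Counter(part)

def merge(a, b):
    """Reduce step: merge two partial results."""
    return a + b

totals = reduce(merge, (local_count(p) for p in partitions), Counter())
```

Combining locally before merging is what keeps the shuffle small in a real cluster.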
 Azure Synapse Security – Apart from the above features, one key aspect to note is the array of
security features packed into Azure Synapse. It is already compliant with nearly 30 industry-
leading standards such as ISO, SOC, FedRAMP, DISA, HIPAA, and FIPS.
o It supports Azure AD authentication, SQL-based authentication, as well as multi-factor
authentication
o It supports data encryption at rest and in transit as well as data classification for sensitive data
o It supports row-level, column-level, as well as object-level security along with dynamic data
masking
o It supports network-level security with virtual networks as well as firewalls
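Dynamic data masking, mentioned above, means unprivileged readers see obfuscated values while the stored data stays unchanged. The sketch below illustrates the idea with an invented masking rule and role names; it is not Synapse's built-in mask functions.

```python
# Simplified sketch of dynamic data masking: the mask is applied at read
# time based on the caller's role, not at storage time. Rule is illustrative.

def mask_email(email: str) -> str:
    """Keep the first character of the local part, hide the rest."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain

def read_row(row, role):
    """Privileged roles see real data; everyone else gets masked values."""
    if role == "admin":
        return row
    return {**row, "email": mask_email(row["email"])}

row = {"id": 1, "email": "alice@example.com"}
masked = read_row(row, role="analyst")
```

Because the mask is applied per read, revoking it requires no rewrite of stored data.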
Azure Synapse is a tightly integrated suite of services that covers the entire spectrum of tasks and
processes used in the workflow of an analytical solution. These architectural components provide a
modular view of the entire suite and a good starting point.
Azure Synapse Analytics Features
 Centralized Data Management: Azure Synapse uses Massively Parallel Processing (MPP) technology,
allowing it to process and manage large workloads and handle large data volumes efficiently. It
delivers a unified experience by managing both data lakes and data warehouses.
 Workload Isolation: This capability lets users manage the execution of heterogeneous workloads.
It offers increased flexibility by exclusively reserving resources for a specific workload group,
while retaining complete control over warehouse resources to satisfy business SLAs.
 Machine Learning Integration: By integrating with Azure Machine Learning, Azure Synapse
Analytics can leverage ML capabilities to score ML models and generate predictions within the data
warehouse itself. Existing trained ML models can be brought into Synapse Analytics rather than
recreated from scratch, saving businesses time, money, and effort.
Further, businesses can analyze data using machine learning algorithms and visualize the results
on a rich Power BI dashboard. This makes it a great tool for managing real-time analytics for:
 Supply chain forecasting
 Inventory reporting
 Predictive maintenance
 Anomaly detection
Azure Synapse Analytics Benefits
Businesses today use a variety of tools to manage, store, and analyze workloads, and things can go wrong
when one of these interconnected systems faces downtime or another technical challenge. Azure Synapse
Analytics offers businesses centralized management of data lakes and data warehouses, and Azure Synapse
Studio offers a unified workspace for data preparation, data management, data warehousing, big data, and
artificial intelligence tasks.
Here are some of the salient benefits:
Accelerate Analytics & Reporting
 Reduced manual efforts for collecting, collating and building reports
 Instant scalability & flexibility leads to no downtimes with workload variations
 Faster DWH deployment
Better BI & Data Visualization
 Seamless, native integration with Power BI makes reporting and analysis of key metrics engaging
and easy to use, and makes insights easier to share with relevant stakeholders across business
streams – fewer data silos lead to more visibility.
Increased IT productivity
 Enables staff to automate infrastructure provisioning and administrative tasks (includes DWH setup,
patch management & maintenance)
Limitless Scaling
 Being a cloud-based service, Azure Synapse Analytics can view, organize, and query relational and
non-relational data faster than traditional on-premises tools, efficiently serving thousands of
concurrent users and systems. In one comparison with Google's BigQuery, Synapse ran the same query
over a petabyte of data in roughly 75% less time.
Azure HDInsight
Azure HDInsight is a service offered by Microsoft that enables us to use open-source frameworks for big
data analytics. Azure HDInsight allows the use of frameworks like Hadoop, Apache Spark, Apache Hive,
LLAP, Apache Kafka, Apache Storm, R, etc., for processing large volumes of data. These tools can be used
on data to perform extract, transform, and load (ETL), data warehousing, machine learning, and IoT
workloads.
Azure HDInsight Features
The main features of Azure HDInsight that set it apart are:
 Cloud and on-premises availability: Azure HDInsight supports big data analytics using
Hadoop, Spark, Interactive Query (LLAP), Kafka, Storm, etc., on the cloud as well as on-premises.
 Scalable and economical: HDInsight can be scaled up or down as and when required. The ability
to be scaled also means that you have to pay for only what you use. You can upgrade your HDInsight
when required, and this eliminates having to pay for unused resources.
 Security: Azure HDInsight protects your assets with industry-standard security. The encryption and
integration with Active Directory make sure that your assets are safe in the Azure Virtual Network.
 Monitoring and analytics: HDInsight’s integration with Azure Monitor helps us to closely watch
what is happening in our clusters and take actions based on that.
 Global availability: Azure HDInsight is available in more regions than most other big data
analytics offerings.
 Highly productive: Productive tools for Hadoop and Spark can be used in HDInsight in different
development environments like Visual Studio, VSCode, Eclipse, and IntelliJ for Scala, Python, R,
Java, etc.
Azure HDInsight Architecture
Before getting into the uses of Azure HDInsight, let’s understand how to choose the right Architecture for
Azure HDInsight. Listed below are best practices for Azure HDInsight Architecture:
 It is recommended that you migrate an on-premises Hadoop cluster to Azure HDInsight using
multiple workload-specific clusters rather than a single cluster; a single large cluster kept
running for everything tends to increase costs unnecessarily over time.
 Use on-demand transient clusters so that clusters are deleted once the workload is complete.
This can reduce resource costs, since HDInsight clusters would otherwise sit idle between runs.
Deleting a cluster does not delete the associated metastores or storage accounts, so you can use
them to recreate the cluster if necessary.
 Because HDInsight clusters can use storage from Azure Storage, Azure Data Lake Storage, or both,
it is best to separate data storage from processing. Besides reducing storage costs, this allows
you to use transient clusters, share data, and scale storage and compute independently.
Azure HDInsight Metastore Best Practices
The Apache Hive Metastore is an important aspect of the Apache Hadoop architecture since it serves as a
central schema repository for other big data engines, including Apache Spark, Interactive Query
(LLAP), Presto, and Apache Pig. It is worth noting that HDInsight uses Azure SQL Database as its Hive
metastore database.
HDInsight supports two types of metastores: default metastores and custom metastores.
 A default metastore can be created for free for any cluster type, but if one is created it cannot be
shared.
 The use of custom metastores is recommended for production clusters since they can be created
and removed without loss of metadata. It is suggested to use a custom metastore to isolate compute
and metadata and to periodically back it up.
HDInsight deletes a default Hive metastore immediately upon cluster deletion. With a custom metastore
stored in Azure SQL Database, the metastore survives deletion of the cluster.
Azure Log Analytics and the Azure portal provide tools for monitoring metastore performance. Keep the
metastore in the same region as the HDInsight cluster to minimize latency.
Azure HDInsight Migration
The following are best practices for Azure HDInsight migration:
Hive metastore migration can be done with scripts or with DB replication. To migrate the Hive metastore
with scripts, generate Hive DDLs from the existing metastore, edit the generated DDL to replace HDFS
URLs with WASB/ADLS/ABFS URLs, and then run the modified DDL on the new metastore. Both the on-premises
and cloud versions of the metastore need to be compatible.
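The DDL-editing step (replacing HDFS URLs with cloud storage URLs) amounts to a text rewrite over the generated DDL. The sketch below shows one way to do it with a regular expression; the table, namenode, account, and container names are placeholders invented for the example.

```python
# Sketch of rewriting HDFS locations in generated Hive DDL to an ADLS Gen2
# (abfs) URL. All names below are illustrative placeholders.
import re

ddl = (
    "CREATE EXTERNAL TABLE sales (id INT)\n"
    "LOCATION 'hdfs://nn1:8020/warehouse/sales';"
)

# Hypothetical target filesystem root in ADLS Gen2.
NEW_ROOT = "abfs://data@myaccount.dfs.core.windows.net"

# Replace the scheme+authority of every HDFS URL, keeping the path intact.
migrated = re.sub(r"hdfs://[^/]+", NEW_ROOT, ddl)
```

Running the rewritten DDL against the new metastore completes the script-based migration.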
Migration Using DB Replication: When migrating your Hive metastore using DB replication, you can
use the Hive MetaTool to replace HDFS URLs with WASB/ADLS/ABFS URLs. For example (with placeholder
container and account names):
./hive --service metatool -updateLocation wasb://<container>@<account>.blob.core.windows.net/ hdfs://nn1:8020/
Azure offers two approaches for migrating data from on-premises: migrating offline or migrating over TLS.
The best choice depends mostly on the volume of data you need to migrate.
Migrating over TLS: Microsoft Azure Storage Explorer, AzCopy, Azure PowerShell, and the Azure CLI
can be used to migrate data over TLS to Azure storage.
Migrating offline: Data Box, Data Box Disk, and Data Box Heavy devices are available for the offline
shipment of large amounts of data to Azure. As an alternative, you can use native tools such as Apache
Hadoop DistCp, Azure Data Factory, or AzCopy to transfer data over the network.
Azure HDInsight Security and DevOps
To protect and maintain the cluster, it is wise to use the Enterprise Security Package (ESP), which
provides directory-based authentication, multi-user support, and role-based access control. ESP can be
used with a range of cluster types, including Apache Hadoop, Apache Spark, Apache HBase, Apache Kafka,
and Interactive Query (Hive LLAP).
To ensure your HDInsight deployment is secure, you need to take the following steps:
Azure Monitor: Use the Azure Monitor service for monitoring and alerting.
Stay on top of updates: Always upgrade HDInsight to the latest version, install OS patches, and reboot
your nodes.
Enforce end-to-end enterprise security, with features such as auditing, encryption, authentication,
authorization, and a private pipeline.
Azure Storage keys should also be protected. Shared Access Signatures (SAS) let you limit access to
your Azure storage resources, and Azure Storage automatically encrypts data written to it using
Storage Service Encryption (SSE) and replicates it for durability.
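The core idea behind a Shared Access Signature is that the service signs a policy string (permissions plus expiry) with the account key, and later verifies the signature before honoring the token. The sketch below shows only that underlying HMAC idea; real Azure SAS tokens use a specific string-to-sign format and parameter set, and the key and policy here are placeholders.

```python
# Heavily simplified illustration of how a SAS works: sign a policy with a
# secret key, verify before granting access. Not the real Azure SAS format.
import hmac, hashlib, base64

ACCOUNT_KEY = b"demo-secret-key"  # placeholder; never hard-code real keys

def sign(policy: str) -> str:
    """Produce a base64 HMAC-SHA256 signature over the policy string."""
    mac = hmac.new(ACCOUNT_KEY, policy.encode(), hashlib.sha256).digest()
    return base64.b64encode(mac).decode()

def verify(policy: str, signature: str) -> bool:
    """Constant-time check that the presented signature matches the policy."""
    return hmac.compare_digest(sign(policy), signature)

policy = "sp=r&se=2025-01-01"   # read-only permission with an expiry
token = sign(policy)
```

Tampering with the policy (say, upgrading `sp=r` to `sp=rw`) invalidates the signature, which is why a SAS can safely be handed to clients.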
Update HDInsight at regular intervals by following the steps outlined below:
 Set up a new HDInsight cluster with the most recent HDInsight version applied.
 Ensure the current cluster's workers and workloads are accounted for.
 Modify applications or workloads as needed.
 Back up any temporary data stored on cluster nodes.
 Delete the existing cluster.
 Create the new cluster with the same default data store and metastore as previously.
 Import any temporary data backups.
 Finish in-progress jobs with the new cluster or start new ones.
Azure HDInsight Uses
The main scenarios in which we can use Azure HDInsight are:
Data Warehousing
Data warehousing is the storage of large volumes of data for retrieval and analysis at any point in time.
Businesses maintain data warehouses so they can analyze their data and make strategic decisions based on it.
HDInsight can be used for data warehousing by performing queries at very large scales on structured or
unstructured data.
Internet of Things (IoT)
We are surrounded by a large number of smart devices that make our lives easier. These IoT-enabled
devices take over the many small decisions we would otherwise have to make about our devices.
IoT requires the processing and analysis of data coming in from millions of smart devices. This data is
the backbone of IoT, and maintaining and processing it is vital for the proper functioning of
IoT-enabled devices.
Azure HDInsight can help in processing large volumes of data coming from numerous devices.
Data Science
Building applications that can analyze data and act on it is vital for AI-enabled solutions. These
apps need to be powerful enough to process large volumes of data and make decisions based on it.
An example worth noting would be the software used in self-driving cars. This software has to constantly
keep on learning from new experiences as well as from historical data to make real-time decisions.
Azure HDInsight helps in making applications that can extract vital information from analyzing large
volumes of data.
Hybrid Cloud
A hybrid cloud is when companies use both public and private clouds for their workflows, gaining the
benefits of both: security, scalability, flexibility, etc.
Azure HDInsight can be used to extend a company’s on-premises infrastructure to the cloud for better
analytics and processing in a hybrid situation.
Azure Databricks
Databricks Introduction
Databricks is a software company founded by the creators of Apache Spark. The company has also
created well-known software such as Delta Lake, MLflow, and Koalas, popular open-source projects
that span data engineering, data science, and machine learning. Databricks develops web-based
platforms for working with Spark that provide automated cluster management and IPython-style
notebooks.
Databricks in Azure
Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services
platform. Azure Databricks offers three environments:
 Databricks SQL
 Databricks data science and engineering
 Databricks machine learning
Databricks SQL
Databricks SQL provides a user-friendly platform that helps analysts who work with SQL queries
to run queries on Azure Data Lake, create multiple visualizations, and build and share dashboards.
Databricks Data Science and Engineering
Databricks data science and engineering provides an interactive working environment for data
engineers, data scientists, and machine learning engineers. The two ways to send data through the
big data pipeline are:
 Ingest into Azure through Azure Data Factory in batches
 Stream real-time by using Apache Kafka, Event Hubs, or IoT Hub
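The two ingestion paths above can be contrasted with a toy sketch: a batch load moves a whole chunk of records at once (the Data Factory style), while a streaming consumer processes events one by one as they arrive (the Kafka/Event Hubs/IoT Hub style). The event payloads below are invented for illustration.

```python
# Toy contrast of batch vs streaming ingestion into a big data pipeline.

def batch_ingest(records):
    """Load everything in one shot, as a scheduled pipeline run would."""
    return {"loaded": len(records)}

def stream_ingest(event_source):
    """Consume events incrementally, yielding a running count per event."""
    seen = 0
    for _event in event_source:
        seen += 1
        yield seen

# Batch path: a fixed set of records arrives together.
batch_result = batch_ingest([{"device": 1}, {"device": 2}])

# Streaming path: events trickle in from a generator (stand-in for a topic).
events = ({"device": i, "temp": 20 + i} for i in range(3))
stream_progress = list(stream_ingest(events))
```

Batch trades latency for throughput and simplicity; streaming delivers each event as it happens at the cost of more operational machinery.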
Databricks Machine Learning
Databricks machine learning is a complete machine learning environment. It provides managed
services for experiment tracking, model training, feature development and management, and model
serving.
Pros and Cons of Azure Databricks
The following pros and cons give a sense of Azure Databricks' strengths and weaknesses.
Pros
 Databricks can process large amounts of data, and since it is part of Azure, the data is cloud-
native.
 The clusters are easy to set up and configure.
 It has an Azure Synapse Analytics connector as well as the ability to connect to Azure DB.
 It is integrated with Active Directory.
 It supports multiple languages. Scala is the main language, but it also works well with Python, SQL,
and R.
Cons
 It does not integrate with Git or other versioning tools out of the box (Databricks Repos now addresses this by syncing folders to Git repositories).
 Currently, it only supports HDInsight, not Azure Batch or AZTK.
Databricks SQL
Databricks SQL allows you to run quick ad-hoc SQL queries on your data lake. Integration with Azure
Active Directory enables running complete Azure-based solutions using Databricks SQL, and integration
with Azure databases lets Databricks SQL query stores such as Synapse Analytics, Azure Cosmos DB, Data
Lake Store, and Blob Storage. Integration with Power BI allows users to discover and share insights
more easily, and BI tools such as Tableau Software can also be used to access Databricks.
Databricks SQL objects can be automated through the REST API.
Data Management
It has three parts:
 Visualization: A graphical presentation of the result of running a query
 Dashboard: A presentation of query visualizations and commentary
 Alert: A notification that a field returned by a query has reached a threshold
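The Alert concept above boils down to comparing a value returned by a query with a configured threshold and deciding whether a notification should fire. The sketch below illustrates that evaluation; the query value, threshold, and condition syntax are invented for the example, not the Databricks SQL alert API.

```python
# Sketch of alert evaluation: fire when a query result crosses a threshold.

def evaluate_alert(query_value, threshold, condition=">"):
    """Return True when the alert condition is met and a notification
    should be sent; supports simple greater-than / less-than conditions."""
    if condition == ">":
        return query_value > threshold
    if condition == "<":
        return query_value < threshold
    raise ValueError("unsupported condition")

# e.g. alert when the daily error count exceeds 100
fired = evaluate_alert(query_value=120, threshold=100)
```

A real alert adds a schedule (re-run the query periodically) and a notification channel on top of this check.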
Computation Management
The following terms are key to running SQL queries in Databricks SQL.
 Query: A valid SQL statement
 SQL endpoint: A resource where SQL queries are executed
 Query history: A list of previously executed queries and their characteristics
Authorization
 User and group: The user is an individual who has access to the system. The set of multiple users
is known as a group.
 Personal access token: An opaque string used to authenticate to the REST API.
 Access control list: Set of permissions attached to a principal that requires access to an object. ACL
(Access Control List) specifies the object and actions allowed in it.
Databricks Data Science & Engineering
Databricks Data Science & Engineering is sometimes also called the Workspace. It is an analytics
platform based on Apache Spark.
Databricks Data Science & Engineering comprises complete open-source Apache Spark cluster
technologies and capabilities. Spark in Databricks Data Science & Engineering includes the following
components:
 Spark SQL and DataFrames: This is the Spark module for working with structured data. A
DataFrame is a distributed collection of data that is organized into named columns. It is very similar
to a table in a relational database or a data frame in R or Python.
 Streaming: Real-time data processing and analysis for analytical and interactive applications.
It integrates with HDFS, Flume, and Kafka.
 MLlib: It is short for Machine Learning Library consisting of common learning algorithms and
utilities including classification, regression, clustering, collaborative filtering, dimensionality
reduction as well as underlying optimization primitives.
 GraphX: Graphs and graph computation for a broad scope of use cases from cognitive analytics to
data exploration.
 Spark Core API: This has the support for R, SQL, Python, Scala, and Java.
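What "a distributed collection of data organized into named columns" means can be shown with a pure-Python sketch: rows addressed by column name, with a select/filter style similar to Spark DataFrames. Spark distributes these operations across a cluster; this toy version does not, and the column and row data are invented.

```python
# Toy DataFrame-like operations over rows with named columns, mimicking the
# select/where style of Spark SQL and DataFrames on a single machine.

rows = [
    {"name": "ada", "dept": "eng", "salary": 95},
    {"name": "bob", "dept": "ops", "salary": 70},
]

def select(data, *cols):
    """Project each row down to the named columns."""
    return [{c: r[c] for c in cols} for r in data]

def where(data, predicate):
    """Keep only the rows for which the predicate holds."""
    return [r for r in data if predicate(r)]

eng_names = select(where(rows, lambda r: r["dept"] == "eng"), "name")
```

In Spark the same chain would be written against a DataFrame, with the engine planning and distributing the filter and projection.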
Workspace
Workspace is the place for accessing all Azure Databricks assets. It organizes objects into folders and
provides access to data objects and computational resources.
The workspace contains:
 Dashboard: It provides access to visualizations.
 Library: Package available to notebook or job running on the cluster. We can also add our own
libraries.
 Repo: A folder whose contents are co-versioned together by syncing them to a local Git repository.
 Experiment: A collection of MLflow runs for training an ML model.
Interface
It supports a UI, an API, and a command-line interface (CLI).
 UI: It provides a user-friendly interface to workspace folders and their resources.
 Rest API: There are two versions, REST API 2.0 and REST API 1.2. REST API 2.0 has features
of REST API 1.2 along with some additional features. So, REST API 2.0 is the preferred version.
 CLI: It is an open-source project that is available on GitHub. CLI is built on REST API 2.0.
Data Management
 Databricks File System (DBFS): It is an abstraction layer over the Blob store. It contains
directories that can contain files or more directories.
 Database: It is a collection of information that can be managed and updated.
 Table: Tables can be queried with Apache Spark SQL and Apache Spark APIs.
 Metastore: It stores information about various tables and partitions in the data warehouse.
Computation Management
To run computations in Azure Databricks, we need to know about the following:
 Cluster: It is a set of computation resources and configurations on which we can run notebooks and
jobs. These are of two types:
o All-purpose: We create an all-purpose cluster by using UI, CLI, or REST API. We can
manually terminate and restart an all-purpose cluster. Multiple users can share such clusters
to do collaborative, interactive analysis.
o Job: The Azure Databricks job scheduler creates a job cluster when we run a job on a new
job cluster and terminates the cluster when the job is complete. We cannot restart a job
cluster.
 Pool: A set of ready-to-use idle instances that reduces cluster start and auto-scaling times.
If the pool does not have enough resources, it expands itself. When the attached cluster is terminated,
the instances it uses are returned to the pool and can be reused by a different cluster.
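The pool behavior described above (fast starts from pre-warmed instances, expansion on shortage, reuse after termination) can be sketched as a small class. The instance names and capacities are illustrative only, not the Databricks pools API.

```python
# Sketch of an instance pool: clusters acquire pre-warmed instances and
# return them on termination so another cluster can reuse them.

class InstancePool:
    def __init__(self, idle_instances):
        self.idle = list(idle_instances)

    def acquire(self, n):
        """Hand out idle instances; expand the pool if it runs short."""
        while len(self.idle) < n:
            self.idle.append(f"new-{len(self.idle)}")  # pool expands itself
        taken, self.idle = self.idle[:n], self.idle[n:]
        return taken

    def release(self, instances):
        """Instances come back on cluster termination, ready for reuse."""
        self.idle.extend(instances)

pool = InstancePool(["i-1", "i-2"])
cluster_nodes = pool.acquire(3)   # needs one more instance than is idle
pool.release(cluster_nodes)       # all three return to the pool
```

Reusing warm instances is what cuts cluster start time compared with provisioning fresh VMs each run.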
Databricks Runtime
Azure Databricks offers several runtimes, whose core components run on the clusters it manages:
 It includes Apache Spark but also adds numerous other features to improve big data analytics.
 Databricks Runtime for machine learning is built on Databricks runtime and provides a ready
environment for machine learning and data science.
 Databricks Runtime for genomics is a version of Databricks runtime that is optimized for working
with genomic and biomedical data.
 Databricks Light is the Azure Databricks packaging of the open-source Apache Spark runtime.
Job
 Workload: There are two types of workloads with respect to the pricing schemes:
o Data engineering workload: This workload works on a job cluster.
o Data analytics workload: This workload runs on an all-purpose cluster.
 Execution context: It is the state of a REPL environment. It supports Python, R, Scala, and SQL.
Model Management
The concepts needed to understand how machine learning models are built are:
 Model: This is a mathematical function that represents the relation between inputs and outputs.
Machine learning consists of training and inference steps. We can train a model by using an existing
data set and using that to predict the outcomes of new data.
 Run: It is a collection of parameters, metrics, and tags that are related to training a machine learning
model.
 Experiment: It is the primary unit of organization and access control for runs. All MLflow runs
belong to an experiment.
Authentication and Authorization
 User and group: A user is an individual who has access to the system. A set of users is a group.
 Access control list: Access control list (ACL) is a set of permissions that are attached to a principal,
which requires access to an object. ACL specifies the object and the actions allowed on it.
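The ACL model described above (permissions attached to a principal for a specific object) can be sketched as a lookup table plus a check. The principal, object, and action names below are invented for illustration, not the Databricks permissions API.

```python
# Minimal ACL sketch: a check succeeds only when some entry grants the
# requested action to one of the caller's principals (user or group).

acl = {
    ("group:analysts", "dashboard:sales"): {"view"},
    ("user:dana",      "dashboard:sales"): {"view", "edit"},
}

def can(principals, obj, action):
    """True if any principal has an ACL entry on obj granting the action."""
    return any(action in acl.get((p, obj), set()) for p in principals)

# dana is also a member of the analysts group
allowed = can(["user:dana", "group:analysts"], "dashboard:sales", "edit")
```

Group entries keep the table small: granting "view" to `group:analysts` once covers every member, while user-level entries handle exceptions.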
Databricks Machine Learning
Databricks machine learning is an integrated end-to-end machine learning platform incorporating managed
services for experiment tracking, model training, feature development and management, and feature and
model serving. Databricks machine learning automates the creation of a cluster that is optimized for
machine learning. Databricks Runtime ML clusters include the most popular machine learning libraries
such as TensorFlow, PyTorch, Keras, and XGBoost. It also includes libraries, such as Horovod, that are
required for distributed training.
With Databricks machine learning, we can:
 Train models either manually or with AutoML
 Track training parameters and models by using experiments with MLflow tracking
 Create feature tables and access them for model training and inference
 Share, manage, and serve models by using Model Registry
We also have access to all of the capabilities of Azure Databricks workspace such as notebooks, clusters,
jobs, data, Delta tables, security and admin controls, and many more.
When to use Databricks
1. Modernize your Data Lake – if you are facing challenges around performance and reliability in your
data lake, or your data lake has become a data swamp, consider Delta as an option to modernize
your Data Lake.
2. Production Machine Learning – if your organization is doing data science work but is having trouble
getting that work into the hands of business users, the Databricks platform was built to help data
scientists move their work from development to production.
3. Big Data ETL – from a cost/performance perspective, Databricks is best in its class.
4. Opening your Data Lake to BI users – If your analyst / BI group is consistently slowed down by the
major lift of the engineering team having to build a pipeline every time they want to access new
data, it might make sense to open the Data Lake to these users through a tool like SQL Analytics
within Databricks.
When not to use Databricks
There are a few scenarios when using Databricks is probably not the best fit for your use case:
1. Sub-second queries – Spark, being a distributed engine, has processing overhead that makes it
nearly impossible to achieve sub-second queries. Your data can still live in the data lake, but for
sub-second queries you will likely want to use a highly tuned speed layer.
2. Small data – Similar to the first point, you won't get the majority of the benefits of Databricks if you
are dealing with very small data (think GBs).
3. Pure BI without a supporting data engineering team – Databricks and SQL Analytics do not erase
the need for a data engineering team – in fact, such a team is more critical than ever in unlocking the
potential of the Data Lake. That said, Databricks offers tools to empower the data engineering team
itself.
4. Teams requiring drag and drop ETL – Databricks has many UI components, but drag-and-drop ETL
is not currently one of them.
Usage of Internet of Things (IoT) Hub
Azure IoT hub allows you to get on with developing cool IoT stuff, and not worry about how it all gets
connected up and managed.
Internet of Things (IoT) offers businesses immediate, real-world opportunities to reduce costs, increase
revenue, and transform their operations. Azure IoT Hub is a managed IoT service which
is hosted in the cloud. It allows bi-directional communication between IoT applications and the devices it
manages. This cloud-to-device connectivity means that you can receive data from your devices, but you
can also send commands and policies back to the devices. How Azure IoT hub differs from the existing
solutions is that it also provides the infrastructure to authenticate, connect and manage the devices
connected to it.
Azure IoT Hub allows full-featured and scalable IoT solutions. Virtually, any device can be connected
to Azure IoT Hub and it can scale up to millions of devices. Events can be tracked and monitored, such
as the creation, failure, and connection of devices.
Azure IoT Hub provides,
 Device libraries for the most commonly used platforms and languages for easy device connectivity.
 Secure communications with multiple options for device-to-cloud and cloud-to-device hyper-scale
communication.
 Queryable storage of per-device state information as well as meta-data.
Managing devices with IoT Hub
The needs and requirements of IoT operators vary substantially in different industries, from transport
to manufacturing to agriculture to utilities. There is also a wide variation in the types of devices used
by IoT operators. IoT Hub is able to provide the capabilities, patterns and code libraries to allow
developers to build management solutions that can manage very diverse sets of devices.
Configuring and controlling devices
Devices which are connected to IoT Hub can be managed using an array of built-in functionality. This
means that:
 Device metadata and state information for all your devices can be stored, synchronized and queried.
 Device state can be set either per-device or in groups depending on common characteristics of the
devices.
 A state change in a device can be automatically responded to by using message routing integration.
The lifecycle of devices with IoT Hub
 Plan
Operators can create a device metadata scheme that allows them to easily carry out bulk management
operations.
 Provision
New devices can be securely provisioned to IoT Hub and operators can quickly discover device
capabilities. The IoT Hub identity registry is used to create device identities and credentials.
 Configure
Device management operations, such as configuration changes and firmware updates can be done in
bulk or by direct methods, while still maintaining system security.
 Monitor
Operators can be easily alerted to any issues arising and at the same time the device collection health
can be monitored, as well as the status of any ongoing operations.
 Retire
Devices need to be replaced, retired or decommissioned. The IoT Hub identity registry is used to
withdraw device identities and credentials.
Device management patterns
IoT Hub supports a range of device management patterns including,
 Reboot
 Factory reset
 Configuration
 Firmware update
 Reporting progress and status
These patterns can be extended to fit your exact situation. Alternatively, new patterns can be designed based
on these templates.
Connecting your devices
You can build applications which run on your devices and interact with IoT Hub using the Azure IoT device
SDK. Windows, Linux distributions, and real-time operating systems are supported platforms. Supported
languages currently include,
 C
 C#
 Java
 Python
 Node.js.
Messaging Patterns
Azure IoT Hub supports a range of messaging patterns including,
 Device to cloud telemetry
 File upload from devices
 Request-reply methods which enable devices to be controlled from the cloud
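The request-reply pattern can be sketched as follows (an illustrative simulation, not the Azure IoT device SDK; the method name and reply shape are assumptions made for the example):

```python
# A device registers handlers for named methods; the "cloud" side then
# invokes one of them and receives a status code plus a result payload.
class Device:
    def __init__(self):
        self.handlers = {}

    def on_method(self, name, handler):
        self.handlers[name] = handler

    def invoke(self, name, payload):
        # Request-reply: the caller waits for a status and a result.
        handler = self.handlers.get(name)
        if handler is None:
            return {"status": 404, "payload": None}
        return {"status": 200, "payload": handler(payload)}

device = Device()
device.on_method("reboot", lambda p: {"rebooting_in": p.get("delay", 0)})
reply = device.invoke("reboot", {"delay": 5})
```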
Message routing and event grid
Both IoT Hub message routing and IoT Hub integration with Event Grid makes it possible to stream data
from your connected devices. However, there are differences. Message routing allows users to route device-
to-cloud messages to a range of supported service endpoints such as Event Hubs and Azure Storage
containers while IoT Hub integration with Event Grid is a fully managed routing service which can be
extended into third-party business applications.
Device data can be routed
In Azure IoT Hub, the message routing functionality is built in. This allows you to set up automatic rules-
based message fan-out. You can use message routing to decide where your hub sends your devices’
telemetry. Routing messages to multiple endpoints doesn't incur any extra costs.
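Rules-based fan-out can be sketched like this (a simplified model in plain Python; real IoT Hub routes are written in a SQL-like query language, and the endpoint names below are made up):

```python
# Each route pairs a predicate over the message with a destination
# endpoint name; a message goes to every endpoint whose rule matches.
routes = [
    (lambda m: m["body"].get("temperature", 0) > 40, "event-hub-alerts"),
    (lambda m: m["properties"].get("type") == "telemetry", "storage-archive"),
]

def fan_out(message: dict) -> list:
    return [endpoint for predicate, endpoint in routes if predicate(message)]

msg = {"body": {"temperature": 55}, "properties": {"type": "telemetry"}}
matched = fan_out(msg)
```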
Building end-to-end solutions
End-to-end solutions can be built by integrating IoT Hub with other Azure services. For example,
 Business processes can be automated using Azure Logic Apps.
 You can run analytic computations in real-time on the data from your devices using Azure Stream
Analytics.
 AI models and machine learning can be added using Azure Machine Learning.
 You can respond rapidly to critical events with Azure Event Grid.
Azure IoT Hub or Azure Event Hub?
Both Azure IoT Hub and Azure Event Hub are cloud services which can ingest, process and store large
amounts of data. However, they were designed with different purposes in mind. Event Hub was developed
for big data streaming while IoT Hub was designed specifically to connect IoT devices at scale to the Azure
Cloud. Therefore, which one you choose to use will depend on the demands of your business.
Security
Businesses face security, privacy, and compliance challenges which are unique to the IoT. Security for IoT
solutions means that devices need to be securely provisioned and there needs to be secure connectivity
between the devices and the cloud, as well as secure data protection in the cloud during processing and
storage.
IoT Hub allows data to be sent on secure communications channels. Each device connects securely to the
hub and each device can be managed securely. You can control access at the per-device level and devices
are automatically provisioned to the correct hub when the device first boots up.
There’s also a range of authentication options depending on device capabilities, including SAS
token-based authentication, individual X.509 certificate authentication for secure, standards-based
authentication, as well as X.509 CA authentication.
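SAS token-based authentication works by signing a resource URI and an expiry time with the device's shared access key. The sketch below follows the commonly documented Azure SAS format; treat the hub name, device name, and key as placeholders, not real credentials:

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote_plus

def generate_sas_token(resource_uri: str, device_key_b64: str,
                       ttl_seconds: int = 3600) -> str:
    # The string to sign is the URL-encoded resource URI, a newline,
    # and the expiry time expressed in seconds since the epoch.
    expiry = int(time.time()) + ttl_seconds
    to_sign = f"{quote_plus(resource_uri)}\n{expiry}".encode()
    key = base64.b64decode(device_key_b64)
    signature = base64.b64encode(
        hmac.new(key, to_sign, hashlib.sha256).digest()).decode()
    return (f"SharedAccessSignature sr={quote_plus(resource_uri)}"
            f"&sig={quote_plus(signature)}&se={expiry}")

# Hypothetical hub/device; the key here is a made-up example value.
token = generate_sas_token("myhub.azure-devices.net/devices/device-1",
                           base64.b64encode(b"not-a-real-key").decode())
```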
High Availability and Disaster Recovery
Uptime goals vary from business to business. Azure IoT Hub offers three main High Availability (HA) and
Disaster Recovery (DR) features including:
 Intra-region HA
The IoT Hub service provides intra-region HA by implementing redundancies in almost all layers of
the service. The SLA published by the IoT Hub service is achieved by making use of these
redundancies, which are available automatically to developers. However, transient failures should be
expected when using cloud computing; therefore, appropriate retry policies need to be built into
components which interact with the cloud in order to deal with these transient failures.
 Cross region DR
Situations may arise when a datacentre suffers from extended outages or some other physical
failure. It is rare but possible that intra-region HA capability may not be able to help in some of
these situations. However, IoT Hub has a number of possible solutions for recovering from extended
outages or physical failures. In these situations, a customer can have a Microsoft initiated failover or
a manual failover.
Both of these options offer defined recovery time objectives (RTO).
Achieving cross region HA
If the RTOs provided by either the Microsoft initiated failover or manual failover aren’t sufficient
for your uptime goals, then another option is to implement a per-device automatic cross region
failover mechanism. In this model, the IoT solution runs in a primary and secondary datacentre in
two different locations. If there’s an outage or a loss of network connectivity in the primary region,
the devices can use the secondary location.
Choosing the right IoT Hub tier
Azure IoT Hub offers two tiers, basic and standard. The basic tier, which is uni-directional from
devices to the cloud, is more suitable if the data is going to be gathered from devices and analyzed
centrally. However, if you want bi-directional communication, enabling you to, for example, control
devices remotely, then the standard tier is more appropriate. Both tiers have the same security and
authentication features.
Each tier has three different sizes (1, 2 and 3), depending on how much data they can handle in a
day. For instance, a level 3 unit can handle 300 million messages a day while a level 1 unit can
handle 400,000.
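Sizing then reduces to dividing the expected daily message volume by the per-unit allowance and rounding up. A quick sketch using the two figures quoted above (other unit sizes exist; only levels 1 and 3 are shown here):

```python
import math

# Daily message allowance per unit, for the two sizes quoted above.
MESSAGES_PER_UNIT = {1: 400_000, 3: 300_000_000}

def units_needed(daily_messages: int, level: int) -> int:
    # Round up: a partial unit still requires a whole unit.
    return math.ceil(daily_messages / MESSAGES_PER_UNIT[level])

small = units_needed(1_000_000, 1)   # 1M msgs/day on level-1 units
large = units_needed(1_000_000, 3)   # same volume on a level-3 unit
```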
IoT Central
IoT Central is an IoT application platform as a service (aPaaS) that reduces the burden and cost of
developing, managing, and maintaining enterprise-grade IoT solutions. If you choose to build with IoT
Central, you'll have the opportunity to focus time, money, and energy on transforming your business
with IoT data, rather than just maintaining and updating a complex and continually evolving IoT
infrastructure.
The web UI lets you quickly connect devices, monitor device conditions, create rules, and manage
millions of devices and their data throughout their life cycle. Furthermore, it enables you to act on device
insights by extending IoT intelligence into line-of-business applications.
The key features of the Azure IoT Hub integration are:
 Handling uplink messages: The Things Stack publishes uplink messages to an Azure IoT Central
Application
 Automatic device provisioning: end devices are automatically created into the Azure IoT Central
Application, using the LoRaWAN device repository information in order to provision the end device
template
 Updating device state in Device Twin: update the device reported properties based on the decoded
payloads, and schedule downlinks based on the device desired properties
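The desired/reported split of a device twin can be sketched as two dictionaries (a deliberately simplified model, not the actual Azure IoT Hub twin schema; the property names are hypothetical):

```python
# "desired" is set from the cloud; "reported" is what the device has
# actually applied and acknowledged back to the cloud.
twin = {
    "desired": {"uplink_interval_s": 60, "led": "on"},
    "reported": {"uplink_interval_s": 300},
}

def pending_changes(twin: dict) -> dict:
    # Desired properties the device has not yet reported as applied.
    return {k: v for k, v in twin["desired"].items()
            if twin["reported"].get(k) != v}

def apply_and_report(twin: dict) -> None:
    # The device applies the desired state and reports it back.
    twin["reported"].update(twin["desired"])

before = pending_changes(twin)
apply_and_report(twin)
after = pending_changes(twin)
```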
Architecture
The Azure IoT Central integration does not require any additional physical resources in your Azure
account. It connects to the Azure IoT Central Application using the underlying Azure IoT Device
Provisioning Service, then submits traffic using the Azure IoT Hub in which the application has been
provisioned.
The single resource deployed in your Azure Account is the Azure IoT Central Application. All
permissions are the minimum permissions for the integration to function.
Implementation details
Azure IoT Hub is designed around standalone end devices communicating directly with the hub. Each end
device must connect to the hub via one of the supported communication protocols (MQTT / AMQP). These
protocols are inherently stateful - each individual end device must have one connection always open in
order to send and receive messages from the Azure IoT Hub.
LoRaWAN end devices are in general low power, low resources devices with distinct traffic patterns.
Communication in the LoRaWAN world also does not have the concept of a connection, in the TCP sense,
but instead focuses on a communication session. Downlink traffic, which would map to IoT Hub cloud-to-device
messages, occurs rarely at the application layer for most use cases. As such, keeping a connection open
per end device is both wasteful and hard to scale, as both communication protocols mentioned above
enforce that each end device has its own individual connection, and no subscription groups semantics are
available.
Based on the above arguments, the Azure IoT Central integration prefers to use an asynchronous, stateless
communication style. When uplink messages are received from an end device, the integration connects on
demand to the Azure IoT Hub and submits the message, and also updates the Device Twin. The data plane
protocol used between The Things Stack and Azure IoT Hub is MQTT, and the connections are always
secure using TLS 1.2.
Device Twin desired properties updates and device creation or deletion events are received by The Things
Stack using an IoT Central Data Export. The Data Export submits the data via HTTP requests which are
authenticated using the API key provided during the integration provisioning, and connections are always
done over TLS. This pipeline allows The Things Stack to avoid long running connections to the Azure IoT
Hub.
Azure Sphere
Microsoft’s website states that “Azure Sphere is a solution for creating highly secured, connected
Microcontroller (MCU) devices”. But it is not just about the MCU, of course.
The solution also includes an operating system and an application platform. This provides product
manufacturers with a chance to create secured, internet-connected devices that can be controlled,
updated, monitored and maintained remotely.
Azure Sphere is a secured, high-level application platform with built-in communication and security
features for internet-connected devices. It comprises a secured, connected, crossover microcontroller unit
(MCU), a custom high-level Linux-based operating system (OS), and a cloud-based security service that
provides continuous, renewable security.
The Azure Sphere MCU integrates real-time processing capabilities with the ability to run a high-level
operating system. An Azure Sphere MCU, along with its operating system and application platform,
enables the creation of secured, internet-connected devices that can be updated, controlled, monitored,
and maintained remotely. A connected device that includes an Azure Sphere MCU, either alongside or in
place of an existing MCU, provides enhanced security, productivity, and opportunity. For example:
 A secured application environment, authenticated connections, and opt-in use of peripherals
minimizes security risks due to spoofing, rogue software, or denial-of-service attacks, among
others.
 Software updates can be automatically deployed from the cloud to any connected device to fix
problems, provide new functionality, or counter emerging methods of attack, thus enhancing the
productivity of support personnel.
 Product usage data can be reported to the cloud over a secured connection to help in diagnosing
problems and designing new products, thus increasing the opportunity for product service, positive
customer interactions, and future development.
The Azure Sphere Security Service is an integral aspect of Azure Sphere. Using this service, Azure
Sphere MCUs safely and securely connect to the cloud and web. The service ensures that the device boots
only with an authorized version of genuine, approved software. In addition, it provides a secured channel
through which Microsoft can automatically download and install OS updates to deployed devices in the
field to mitigate security problems. Neither manufacturer nor end-user intervention is required, thus
closing a common security hole.
Azure Sphere consists of three main parts:
Secured Micro-controller Unit (MCU)
The first part is a crossover class of MCU with built-in Microsoft security technology and connectivity.
Each Azure Sphere MCU includes a wireless communications subsystem that facilitates an internet
connection.
It is worth mentioning that the Sphere’s MCU provides a kind of hardware firewall or “sandbox” that
ensures that only certain I/O peripherals are accessible to the core to which they are mapped. Consequently,
you cannot connect any sensors without first declaring them.
The application processor also features an ARM Cortex-A subsystem, responsible for executing the
operating system, applications and services. It supports two operating environments:
 Normal World (NW) – executes code in both user mode and supervisor mode
 Secure World (SW) – executes only the Microsoft-supplied Security Monitor.
Secured OS
The second component is a highly-secured OS from Microsoft with a custom kernel running on top of
Microsoft’s Security Monitor. This creates a trustworthy defense in depth platform.
The purpose of the OS services is two-fold: to host the application container, and to facilitate the
communication with the Azure Sphere Security Service described below. These services manage Wi-Fi
authentication, including a network firewall for all outbound traffic.
Cloud Security
The Azure Sphere Security Service guards every Azure Sphere device by renewing security, identifying
emerging threats, and brokering trust among devices and the cloud. It also provides certificate-based
authentication. Additionally, the remote attestation service connects with the device to test if it booted
with the correct software, including its version.
Furthermore, the Security Service distributes automatic updates for all Microsoft-supplied Azure Sphere
OS and OEM software. As a result, manufacturers can securely update their devices remotely without
having to worry about whether any update is falsified.
Finally, there is a small crash-reporting module which provides crash reporting for deployed software.
How does Azure Sphere work in practice?
You might wonder how to use Azure Sphere in a real-life scenario. Let’s say that our company, Predica,
is a manufacturer of washing machines.
In our example, Predica provides high-class, intelligent washing machines that users can remotely
control from a mobile app. Each washing machine has an embedded Azure Sphere MCU.
Predica has a software development team responsible for developing both software for the washing
machines, as well as the mobile application. There is also a support team responsible for maintenance
and detection of potential errors.
Take a look at the diagram below that visualizes the scenario:
As you can see, there are three main parties in the network:
 Microsoft – handles the security aspect. The Azure Sphere Security Service is used to send system
updates automatically, so Predica as the manufacturer does not have to worry about them
 Predica software team – develops and releases revisions of software for the washing machines,
which is uploaded to the devices using Microsoft Azure cloud services
 Predica support team – responsible for maintenance, checking the system and application versions
on each washer, as well as detecting possible issues.
Azure Sphere provides a way to monitor and control all devices in a secured and centralized way. This is the
real power of this solution.
How to begin your journey with Azure Sphere?
The Azure Sphere Development Board (hardware) is already available to you. You can order it from
the Seeed Studio online store. However, once you receive the board, there are a few additional things that
you will need to get started:
 Visual Studio 2017 IDE – Enterprise, Professional or Community, version 15.7 or later
 A PC running Windows 10 Anniversary Update or later
 Azure Sphere SDK Preview for Visual Studio
 An unused USB port on the PC.
It is important to note that at this time the tools for Azure Sphere are still in preview. You do not require a
Microsoft Azure cloud subscription to use Azure Sphere and start development.
Azure Sphere and the seven properties of highly secured devices
A primary goal of the Azure Sphere platform is to provide high-value security at a low cost, so that price-
sensitive, microcontroller-powered devices can safely and reliably connect to the internet. As network-
connected toys, appliances, and other consumer devices become commonplace, security is of utmost
importance. Not only must the device hardware itself be secured, its software and its cloud connections
must also be secured. A security lapse anywhere in the operating environment threatens the entire product
and, potentially, anything or anyone nearby.
Based on Microsoft's decades of experience with internet security, the Azure Sphere team has
identified seven properties of highly secured devices. The Azure Sphere platform is designed around these
seven properties:
Hardware-based root of trust. A hardware-based root of trust ensures that the device and its identity
cannot be separated, thus preventing device forgery or spoofing. Every Azure Sphere MCU is identified by
an unforgeable cryptographic key that is generated and protected by the Microsoft-designed Pluton security
subsystem hardware. This ensures a tamper-resistant, secured hardware root of trust from factory to end
user.
Defense in depth. Defense in depth provides for multiple layers of security and thus multiple mitigations
against each threat. Each layer of software in the Azure Sphere platform verifies that the layer above it is
secured.
Small trusted computing base. Most of the device's software remains outside the trusted computing base,
thus reducing the surface area for attacks. Only the secured Security Monitor, Pluton runtime, and Pluton
subsystem—all of which Microsoft provides—run on the trusted computing base.
Dynamic compartments. Dynamic compartments limit the reach of any single error. Azure Sphere MCUs
contain silicon counter-measures, including hardware firewalls, to prevent a security breach in one
component from propagating to other components. A constrained, "sandboxed" runtime environment
prevents applications from corrupting secured code or data.
Password-less authentication. The use of signed certificates, validated by an unforgeable cryptographic
key, provides much stronger authentication than passwords. The Azure Sphere platform requires every
software element to be signed. Device-to-cloud and cloud-to-device communications require further
authentication, which is achieved with certificates.
Error reporting. Errors in device software or hardware are typical in emerging security attacks; errors that
result in device failure constitute a denial-of-service attack. Device-to-cloud communication provides early
warning of potential errors. Azure Sphere devices can automatically report operational data and errors to a
cloud-based analysis system, and updates and servicing can be performed remotely.
Renewable security. The device software is automatically updated to correct known vulnerabilities or
security breaches, requiring no intervention from the product manufacturer or the end user. The Azure
Sphere Security Service updates the Azure Sphere OS and your applications automatically.
Azure Sphere architecture
Working together, the Azure Sphere hardware, software, and Security Service enable unique, integrated
approaches to device maintenance, control, and security.
The hardware architecture provides a fundamentally secured computing base for connected devices,
allowing you to focus on your product.
The software architecture, with a secured custom OS kernel running atop the Microsoft-written Security
Monitor, similarly enables you to concentrate your software efforts on value-added IoT and device-specific
features.
The Azure Sphere Security Service supports authentication, software updates, and error reporting over
secured cloud-to-device and device-to-cloud channels. The result is a secured communications
infrastructure that ensures that your products are running the most up-to-date Azure Sphere OS. For
architecture diagrams and examples of cloud architectures, see Browse Azure Architectures.
Hardware architecture
An Azure Sphere crossover MCU consists of multiple cores on a single die, as the following figure shows.
[Figure: Azure Sphere MCU hardware architecture]
Each core, and its associated subsystem, is in a different trust domain. The root of trust resides in the Pluton
security subsystem. Each layer of the architecture assumes that the layer above it may be compromised.
Within each layer, resource isolation and dynamic compartments provide added security.
Microsoft Pluton security subsystem
The Pluton security subsystem is the hardware-based (in silicon) secured root of trust for Azure Sphere. It
includes a security processor core, cryptographic engines, a hardware random number generator,
public/private key generation, asymmetric and symmetric encryption, support for elliptic curve digital
signature algorithm (ECDSA) verification for secured boot, and measured boot in silicon to support remote
attestation with a cloud service, as well as various tampering counter-measures including an entropy
detection unit.
As part of the secured boot process, the Pluton subsystem boots various software components. It also
provides runtime services, processes requests from other components of the device, and manages critical
components for other parts of the device.
High-level application core
The high-level application core features an ARM Cortex-A subsystem that has a full memory management
unit (MMU). It enables hardware-based compartmentalization of processes by using trust zone functionality
and is responsible for running the operating system, high-level applications, and services. It supports two
operating environments: Normal World (NW), which runs code in both user mode and supervisor mode,
and Secure World (SW), which runs only the Microsoft-supplied Security Monitor. Your high-level
applications run in NW user mode.
Real-time cores
The real-time cores feature an ARM Cortex-M I/O subsystem that can run real-time capable applications
as either bare-metal code or a real-time operating system (RTOS). Such applications can map peripherals
and communicate with high-level applications but cannot access the internet directly.
Connectivity and communications
The first Azure Sphere MCU provides an 802.11 b/g/n Wi-Fi radio that operates at both 2.4GHz and 5GHz.
High-level applications can configure, use, and query the wireless communications subsystem, but they
cannot program it directly. In addition to or instead of using Wi-Fi, Azure Sphere devices that are properly
equipped can communicate on an Ethernet network.
Multiplexed I/O
The Azure Sphere platform supports a variety of I/O capabilities, so that you can configure embedded
devices to suit your market and product requirements. I/O peripherals can be mapped to either the high-
level application core or to a real-time core.
Microsoft firewalls
Hardware firewalls are silicon countermeasures that provide "sandbox" protection to ensure that I/O
peripherals are accessible only to the core to which they are mapped. The firewalls impose
compartmentalization, thus preventing a security threat that is localized in the high-level application core
from affecting the real-time cores' access to their peripherals.
Integrated RAM and flash
Azure Sphere MCUs include a minimum of 4MB of integrated RAM and 16MB of integrated flash memory.
Software architecture and OS
The high-level application platform runs the Azure Sphere OS along with a device-specific high-level
application that can communicate both with the internet and with real-time capable applications that run on
the real-time cores. The following figure shows the elements of this platform; Microsoft-supplied elements
are shown in gray.
High-level Application Platform
Microsoft provides and maintains all software other than your device-specific applications. All software
that runs on the device, including the high-level application, is signed by the Microsoft certificate authority
(CA). Application updates are delivered through the trusted Microsoft pipeline, and the compatibility of
each update with the Azure Sphere device hardware is verified before installation.
Application runtime
The Microsoft-provided application runtime is based on a subset of the POSIX standard. It consists of
libraries and runtime services that run in NW user mode. This environment supports the high-level
applications that you create.
Application libraries support networking, storage, and communications features that are required by high-
level applications but do not support direct generic file I/O or shell access, among other constraints. These
restrictions ensure that the platform remains secured and that Microsoft can provide security and
maintenance updates. In addition, the constrained libraries provide a long-term stable API surface so that
system software can be updated to enhance security while retaining binary compatibility for applications.
OS services
OS services host the high-level application container and are responsible for communicating with the Azure
Sphere Security Service. They manage network authentication and the network firewall for all outbound
traffic. During development, OS services also communicate with a connected PC and the application that
is being debugged.
Custom Linux kernel
The custom Linux-based kernel runs in supervisor mode, along with a boot loader. The kernel is carefully
tuned for the flash and RAM footprint of the Azure Sphere MCU. It provides a surface for preemptable
execution of user-space processes in separate virtual address spaces. The driver model exposes MCU
peripherals to OS services and applications. Azure Sphere drivers include Wi-Fi (which includes a TCP/IP
networking stack), UART, SPI, I2C, and GPIO, among others.
Security Monitor
The Microsoft-supplied Security Monitor runs in SW. It is responsible for protecting security-sensitive
hardware, such as memory, flash, and other shared MCU resources and for safely exposing limited access
to these resources. The Security Monitor brokers and gates access to the Pluton Security Subsystem and the
hardware root of trust and acts as a watchdog for the NW environment. It starts the boot loader, exposes
runtime services to NW, and manages hardware firewalls and other silicon components that are not
accessible to NW.
Azure Sphere Security Service
The Azure Sphere Security Service comprises three components: password-less authentication, update, and
error reporting.
 Password-less authentication. The authentication component provides remote attestation and
password-less authentication. The remote attestation service connects via a challenge-response
protocol that uses the measured boot feature on the Pluton subsystem. It verifies not merely that the
device booted with the correct software, but with the correct version of that software.
After attestation succeeds, the authentication service takes over. The authentication service
communicates over a secured TLS connection and issues a certificate that the device can present to a
web service, such as Microsoft Azure or a company's private cloud. The web service validates the
certificate chain, thus verifying that the device is genuine, that its software is up to date, and that
Microsoft is its source. The device can then connect safely and securely with the online service.
 Update. The update service distributes automatic updates for the Azure Sphere OS and for
applications. The update service ensures continued operation and enables the remote servicing and
update of application software.
 Error reporting. The error reporting service provides simple crash reporting for deployed software.
To obtain richer data, use the reporting and analysis features that are included with a Microsoft Azure
subscription.
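The challenge-response attestation flow described above can be sketched in miniature. Everything in this sketch — the key names, the HMAC construction, the version check — is a simplified assumption for illustration only; the real protocol uses the Pluton subsystem's measured boot feature and X.509 certificate chains rather than a shared secret.

```python
import hashlib
import hmac
import os

# Hypothetical sketch of challenge-response remote attestation: the service
# sends a nonce; the device answers with a keyed hash over the nonce plus its
# measured-boot hash. Key handling and message format are simplified.
DEVICE_KEY = b"per-device-secret-provisioned-in-hardware"
EXPECTED_BOOT_HASH = hashlib.sha256(b"os-build-1.2.3").hexdigest()

def device_respond(nonce: bytes, boot_hash: str) -> str:
    """Device side: prove possession of the key and the measured boot state."""
    return hmac.new(DEVICE_KEY, nonce + boot_hash.encode(), hashlib.sha256).hexdigest()

def service_verify(nonce: bytes, boot_hash: str, response: str) -> bool:
    """Service side: the device must not merely run the correct software,
    but the correct *version* of it (the measured hash must match exactly)."""
    if boot_hash != EXPECTED_BOOT_HASH:
        return False  # right software family but wrong version still fails
    expected = hmac.new(DEVICE_KEY, nonce + boot_hash.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response)

nonce = os.urandom(32)
assert service_verify(nonce, EXPECTED_BOOT_HASH, device_respond(nonce, EXPECTED_BOOT_HASH))
stale = hashlib.sha256(b"os-build-1.0.0").hexdigest()
assert not service_verify(nonce, stale, device_respond(nonce, stale))
```

Only after this attestation succeeds does the authentication service issue the certificate the device presents to a web service.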
Azure Cloud shell and Mobile Apps
Azure Cloud Shell is a Microsoft service that gives you a Bash or PowerShell console directly in your
browser. Because the service is browser-based, there is no local setup to install or maintain for either
shell. Cloud Shell is tightly integrated with the cloud: if your only concern is the console, you never have
to worry about the underlying infrastructure. The goal of Azure Cloud Shell is to let you develop and manage
Azure resources in a friendlier environment. The service offers a pre-configured, browser-accessible shell
experience for managing Azure resources without the additional cost of machine maintenance, versioning, and
installation. Since the whole idea is to provide interactive sessions, the machine, which is allocated on a
per-request basis, automatically terminates if left idle for 20 minutes. The latest upgrades enable Azure
Cloud Shell to run on Ubuntu 16.04 LTS.
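The per-request, idle-terminated session model can be illustrated with a short sketch. The 20-minute limit comes from the text above; the class and its methods are hypothetical, since Azure's actual session management is internal to the service.

```python
# Illustrative model of a Cloud Shell session host: allocated on demand,
# reclaimed after 20 idle minutes. Files persist separately in Azure Files.
IDLE_LIMIT_SECONDS = 20 * 60

class CloudShellSession:
    def __init__(self):
        self.last_activity = 0.0
        self.active = True

    def run_command(self, now: float):
        self.last_activity = now  # any interaction resets the idle clock

    def reap_if_idle(self, now: float):
        if now - self.last_activity > IDLE_LIMIT_SECONDS:
            self.active = False  # the machine is terminated, not your files

s = CloudShellSession()
s.run_command(now=0.0)
s.reap_if_idle(now=19 * 60)
assert s.active        # under the limit: session survives
s.reap_if_idle(now=21 * 60)
assert not s.active    # idle for more than 20 minutes: terminated
```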
Getting Started with Azure Cloud Shell
The Cloud Shell service runs in an Azure container, and the details depend on your subscription type. Not
every subscriber has to pay for the storage account separately; if your subscription allows, the account can
be created and associated with the current package.
Also, the storage account is tied to Cloud Shell and can be used right away. The container is mounted
under the PowerShell user profile. In short, Azure Cloud Shell is your Microsoft-managed admin
machine in Azure, which enables you to:
 Get authenticated shell access to Azure from anywhere in the world
 Use common programming languages and tools in a shell that is maintained and updated by
Microsoft
 Persist your files across sessions in Azure Files
With Azure, you have the flexibility to choose the shell experience that best matches the way you work;
both PowerShell and Bash experiences are available.
Microsoft Azure Cloud Shell Important Features
Here are the most important features of Azure Cloud Shell:
Automatic Authentication for Improved Security
Cloud Shell automatically and securely authenticates account access for PowerShell and the Azure CLI. In
addition, the interactive session terminates if the shell is inactive for more than 20 minutes. This
automatic behavior helps improve security.
Persistence Across Sessions
To persist your files across sessions, Cloud Shell walks you through attaching an Azure file share on first
launch. Once attached, Cloud Shell reconnects to that storage automatically for all future sessions. Your
home directory is saved as a .img file in your Azure file share. Files outside of your home directory and
the machine state are not persisted across sessions. Refer to the Cloud Shell best practices for storing
secrets such as SSH keys.
Virtual Access from Anywhere
The service lets you connect to Azure through a browser-based, authenticated shell experience that is
hosted in the cloud and can be accessed from anywhere. Cloud Shell assigns a machine to a unique user,
and the user account is authenticated for each session for increased security. You get a modern CLI
experience from multiple access points, including the Azure portal, shell.azure.com, the Azure mobile app,
Azure docs (such as the Azure PowerShell and Azure CLI pages), and the VS Code Azure Account extension.
Common Programming Languages and Tools
Like any other Microsoft component, Cloud Shell is regularly updated and maintained by the platform. The
browser-based service comes with common CLI tools, including PowerShell modules, Linux shell interpreters,
source control, text editors, Azure tools, container tools, build tools, database tools, and many more.
Cloud Shell also supports a number of popular programming languages, including Python, .NET, and Node.js.
Azure Drive
Cloud Shell sessions in PowerShell begin in the Azure drive. This lets you navigate the entire range of
Azure resources, including Storage, Network, and Compute, with discovery and navigation that work like
filesystem navigation. Using the drive is optional, however: you can still manage resources with Azure
PowerShell cmdlets, and whatever changes you make to Azure resources are reflected in the drive right away.
To refresh the resources, run dir -Force.
Configured and Authenticated Azure Workstation
Cloud Shell's security and authentication are backed by Microsoft, which manages the service and, as
mentioned earlier, keeps its language support and command-line tools up to date. Cloud Shell is also
responsible for securely authenticating instant, automatic access to your resources through the Azure CLI.
Seamless Deployment
One of the latest additions to Cloud Shell is a graphical text editor, integrated from the open-source
Monaco Editor. The editor lets you create and customize files without leaving the shell, which makes for
smooth, seamless deployment through Azure PowerShell or Azure CLI 2.0.
As far as pricing is concerned, the Cloud Shell machine hosting is free, but it requires a mounted Azure
Files share, and regular storage costs apply to that share. To get the most out of the service, Azure
training and certification provide a more detailed understanding of Azure Cloud Shell.
Azure Mobile Apps
Azure Mobile Apps (also known as the Microsoft Datasync Framework) gives enterprise developers and
system integrators a mobile-application development platform that's highly scalable and globally
available. The framework provides your mobile app with:
 Authentication
 Data query
 Offline data synchronization
Azure Mobile Apps is designed to work with Azure App Service. Since it's based on ASP.NET 6, it can
also be run as a container in Azure Container Apps or Azure Kubernetes Service.
Why Mobile Apps?
With the Mobile Apps SDKs, you can:
 Build native and cross-platform apps: Build cloud-enabled apps for Android™, iOS, or Windows
using native SDKs.
 Connect to your enterprise systems: Authenticate your users with Azure Active Directory, and
connect to enterprise data stores.
 Build offline-ready apps with data sync: Make your mobile workforce more productive by building
apps that work offline. Use Azure Mobile Apps to sync data in the background.
Azure Mobile Apps features
The following features are important to cloud-enabled mobile development:
 Authentication and authorization: Use Azure Mobile Apps to sign in users with social and
enterprise identity providers. Azure App Service supports Azure Active Directory, Facebook™, Google®,
Microsoft, Twitter®, and OpenID Connect®. Azure Mobile Apps supports any authentication
scheme that is supported by ASP.NET Core.
 Data access: Azure Mobile Apps provides a mobile-friendly OData v4 data source that's linked to a
compatible database via Entity Framework Core. Any compatible database can be used including
Azure SQL, Azure Cosmos DB, or an on-premises Microsoft SQL Server.
 Offline sync: Build robust and responsive mobile applications that operate on an offline dataset.
You can sync this dataset automatically with the service, and handle conflicts with ease.
 Client SDKs: There's a complete set of client SDKs that cover cross-platform development (.NET,
and Apache Cordova™). Each client SDK is available with an MIT license and is open-source.
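The offline-sync pattern behind these features can be sketched as a push of local edits followed by an incremental pull against a per-row version watermark. This mirrors the general offline-store pattern that Azure Mobile Apps uses, but the classes and the conflict policy here (server wins for untouched rows) are simplified assumptions for illustration, not the SDK's API.

```python
# Hypothetical sketch of offline data sync: a server tracks a version per
# row; clients push pending offline edits, then pull only rows changed
# since their last-seen version (the watermark).
class Server:
    def __init__(self):
        self.rows, self.version = {}, 0

    def push(self, key, value):
        self.version += 1
        self.rows[key] = (value, self.version)

    def pull_since(self, since):
        # incremental sync: only rows changed after the client's watermark
        return {k: v for k, v in self.rows.items() if v[1] > since}

class OfflineClient:
    def __init__(self, server):
        self.server, self.local, self.watermark = server, {}, 0
        self.pending = []  # edits made while offline

    def edit(self, key, value):
        self.local[key] = value
        self.pending.append((key, value))

    def sync(self):
        for key, value in self.pending:  # push local edits first
            self.server.push(key, value)
        self.pending.clear()
        for key, (value, ver) in self.server.pull_since(self.watermark).items():
            self.local[key] = value      # server wins for rows we didn't touch
            self.watermark = max(self.watermark, ver)

server = Server()
server.push("task1", "buy milk")
client = OfflineClient(server)
client.sync()                          # initial pull
client.edit("task2", "call supplier")  # edit made while offline
client.sync()                          # push, then incremental pull
assert client.local == {"task1": "buy milk", "task2": "call supplier"}
assert server.rows["task2"][0] == "call supplier"
```

Pushing before pulling means local offline edits reach the server before the client absorbs remote changes; a production sync layer would also detect conflicting concurrent edits instead of letting the server silently win.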
Azure App Service features
The following platform features are useful for mobile production sites:
 Autoscaling: With App Service, you can quickly scale up or scale out to handle any incoming
customer load. Manually select the number and size of VMs, or set up autoscaling to scale your
service based on load or schedule.
 Staging environments: App Service can run multiple versions of your site. You can perform A/B
testing and do in-place staging of a new mobile service.
 Continuous deployment: App Service can integrate with common source control
management (SCM) systems, allowing you to easily deploy a new version of your mobile service.
 Virtual networking: App Service can connect to on-premises resources by using a virtual network,
Azure ExpressRoute, or hybrid connections.
 Isolated and dedicated environments: For securely running Azure App Service apps, you can run
App Service in a fully isolated and dedicated environment. This environment is ideal for application
workloads that require high scale, isolation, or secure network access.

More Related Content

Similar to UNIT -IV.docx

Best-Practices-for-Using-Tableau-With-Snowflake.pdf
Best-Practices-for-Using-Tableau-With-Snowflake.pdfBest-Practices-for-Using-Tableau-With-Snowflake.pdf
Best-Practices-for-Using-Tableau-With-Snowflake.pdfssuserf8f9b2
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfan
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxArunPandiyan890855
 
Mini project on microsoft azure based on time
Mini project on microsoft azure based on timeMini project on microsoft azure based on time
Mini project on microsoft azure based on timeLawalMuhd2
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)James Serra
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on AzureTrivadis
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Nathan Bijnens
 
Azure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxAzure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxshaikmadarbi3zen
 
Azure Data Engineering Course in Hyderabad
Azure Data Engineering  Course in HyderabadAzure Data Engineering  Course in Hyderabad
Azure Data Engineering Course in Hyderabadsowmyavibhin
 
"Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad ""Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad "madhupriya3zen
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Trivadis
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure DatabricksJames Serra
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekMark Kromer
 
Azure satpn19 time series analytics with azure adx
Azure satpn19   time series analytics with azure adxAzure satpn19   time series analytics with azure adx
Azure satpn19 time series analytics with azure adxRiccardo Zamana
 
Azure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyAzure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyNilesh Shah
 

Similar to UNIT -IV.docx (20)

Azure Data Engineering.pdf
Azure Data Engineering.pdfAzure Data Engineering.pdf
Azure Data Engineering.pdf
 
Best-Practices-for-Using-Tableau-With-Snowflake.pdf
Best-Practices-for-Using-Tableau-With-Snowflake.pdfBest-Practices-for-Using-Tableau-With-Snowflake.pdf
Best-Practices-for-Using-Tableau-With-Snowflake.pdf
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptx
 
Mini project on microsoft azure based on time
Mini project on microsoft azure based on timeMini project on microsoft azure based on time
Mini project on microsoft azure based on time
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
 
Azure Data Engineering.pptx
Azure Data Engineering.pptxAzure Data Engineering.pptx
Azure Data Engineering.pptx
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
 
Azure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxAzure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptx
 
Azure Data Engineering Course in Hyderabad
Azure Data Engineering  Course in HyderabadAzure Data Engineering  Course in Hyderabad
Azure Data Engineering Course in Hyderabad
 
"Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad ""Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad "
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Azure Synapse Analytics
Azure Synapse AnalyticsAzure Synapse Analytics
Azure Synapse Analytics
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data Week
 
Azure satpn19 time series analytics with azure adx
Azure satpn19   time series analytics with azure adxAzure satpn19   time series analytics with azure adx
Azure satpn19 time series analytics with azure adx
 
Azure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyAzure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandy
 

More from Revathiparamanathan (19)

UNIT 1 NOTES.docx
UNIT 1 NOTES.docxUNIT 1 NOTES.docx
UNIT 1 NOTES.docx
 
Unit 3,4.docx
Unit 3,4.docxUnit 3,4.docx
Unit 3,4.docx
 
UNIT II.docx
UNIT II.docxUNIT II.docx
UNIT II.docx
 
UNIT V.docx
UNIT V.docxUNIT V.docx
UNIT V.docx
 
COMPILER DESIGN.docx
COMPILER DESIGN.docxCOMPILER DESIGN.docx
COMPILER DESIGN.docx
 
UNIT -III.docx
UNIT -III.docxUNIT -III.docx
UNIT -III.docx
 
UNIT - II.docx
UNIT - II.docxUNIT - II.docx
UNIT - II.docx
 
UNIT -V.docx
UNIT -V.docxUNIT -V.docx
UNIT -V.docx
 
UNIT - I.docx
UNIT - I.docxUNIT - I.docx
UNIT - I.docx
 
CC -Unit3.pptx
CC -Unit3.pptxCC -Unit3.pptx
CC -Unit3.pptx
 
CC.pptx
CC.pptxCC.pptx
CC.pptx
 
Unit 4 notes.pdf
Unit 4 notes.pdfUnit 4 notes.pdf
Unit 4 notes.pdf
 
Unit 3 notes.pdf
Unit 3 notes.pdfUnit 3 notes.pdf
Unit 3 notes.pdf
 
Unit 1 notes.pdf
Unit 1 notes.pdfUnit 1 notes.pdf
Unit 1 notes.pdf
 
Unit 2 notes.pdf
Unit 2 notes.pdfUnit 2 notes.pdf
Unit 2 notes.pdf
 
Unit 5 notes.pdf
Unit 5 notes.pdfUnit 5 notes.pdf
Unit 5 notes.pdf
 
CC.pptx
CC.pptxCC.pptx
CC.pptx
 
Unit-4 Day1.pptx
Unit-4 Day1.pptxUnit-4 Day1.pptx
Unit-4 Day1.pptx
 
Scala Introduction.pptx
Scala Introduction.pptxScala Introduction.pptx
Scala Introduction.pptx
 

Recently uploaded

main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 

Recently uploaded (20)

main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 

UNIT -IV.docx

  • 1. 191AIE503T CLOUD COMPUTING UNIT - IV UNIT –IV AZURE CLOUD AND CORE SERVICES Azure Synapse Analytics - HDInsight-Azure Data bricks - Usage of Internet of Things (IoT) Hub-IoT Central-Azure Sphere-Azure Cloud shell and Mobile Apps Azure Synapse Analytics Introduction In the mid of 2016, Azure made Azure SQL Data Warehouse service generally available for data warehousing on the cloud. Since then, this service has gone through several iterations, and towards the end of 2019, Microsoft announced that the Azure SQL Data Warehouse service would be rebranded as Azure Synapse Analytics. This service is the de-facto service for combining data warehousing and big data analytics, with many new features of the service in preview as well. High-Level Architecture Online Transaction Processing Workloads (OLTP) typically involve transactional data that is voluminous in terms of high reads and writes. The data access pattern usually involves a lot of scalar and tabular datasets. And data ingestion generally happens through user transactions in small batches of rows. Online Analytical Processing (OLAP) applications typically store and process large volumes of data collected from various sources, which may be transformed and/or modeled in the OLAP repository, and then large datasets are aggregated for ad-hoc reporting and analytical use-cases. The latter is the use-case where Synapse Analytics fits in the overall data landscape, as shown below. Azure Data Lake Storage forms the bedrock of big data storage, and Power BI forms the visualization layer, as shown below.
  • 2. 191AIE503T CLOUD COMPUTING UNIT - IV Azure Synapse Components and Features There are multiple components of Synapse Analytics architecture on Azure. Let’s understand all these components one by one.  Synapse Analytics is basically an analytics service that has a virtually unlimited scale to support analytics workloads  Synapse Workspaces (in preview as of Sept 2020) provides an integrated console to administer and operate different components and services of Azure Synapse Analytics
  • 3. 191AIE503T CLOUD COMPUTING UNIT - IV  Synapse Analytics Studio is a web-based IDE to enable code-free or low-code developer experience to work with Synapse Analytics  Synapse supports a number of languages like SQL, Python, .NET, Java, Scala, and R that are typically used by analytic workloads  Synapse supports two types of analytics runtimes – SQL and Spark (in preview as of Sept 2020) based that can process data in a batch, streaming, and interactive manner  Synapse is integrated with numerous Azure data services as well, for example, Azure Data Catalog, Azure Lake Storage, Azure Databricks, Azure HDInsight, Azure Machine Learning, and Power BI  Synapse also provides integrated management, security, and monitoring related services to support monitoring and operations on the data and services supported by Synapse  Data Lake Storage is suited for big data scale of data volumes that are modeled in a data lake model. This storage layer acts as the data source layer for Synapse. Data is typically populated in Synapse from Data Lake Storage for various analytical purposes Now that we understand different layers or components of the architecture let’s understand the core pillars of Synapse.  Azure Synapse Studio – This tool is a web-based SaaS tool that provides developers to work with every aspect of Synapse Analytics from a single console. In an analytical solution development life- cycle using Synapse, one generally starts with creating a workspace and launching this tool that provides access to different synapse features like Ingesting data using import mechanisms or data pipelines and create data flows, explore data using notebooks, analyze data with spark jobs or SQL scripts, and finally visualize data for reporting and dash boarding purposes. This tool also provides
features for authoring artifacts, debugging code, optimizing performance by assessing metrics, integration with CI/CD tools, etc.

• Azure Synapse Data Integration – Different tools can be used to load data into Synapse, but having an integrated orchestration engine helps reduce the dependency on, and management of, separate tool instances and data pipelines. This service comes with an integrated orchestration engine, identical to the one in Azure Data Factory, to create data pipelines and rich data transformations within the Synapse workspace itself. Key features include support for 90+ data sources: almost 15 Azure-based data sources, 26 open-source and cross-cloud data warehouses and databases, 6 file-based data sources, 3 NoSQL data sources, 28 services and apps that can serve as data providers, as well as 4 generic protocols like ODBC and REST. Pipelines can be created using built-in templates from Synapse Studio to integrate data from various sources.
• Synapse SQL Pools – This feature provides the same data warehousing capabilities that were available in earlier versions of the service, when it was branded as SQL DW. It is available in a provisioned manner, where a fixed capacity of DWU units is allocated to the service instance for data processing. Data can be imported into Synapse using different mechanisms like SSIS, PolyBase, and Azure Data Factory. Synapse stores data in a columnar format and enables distributed querying, which is better suited to the performance profile of OLAP workloads. SQL Pools have built-in support for data streaming, as well as a few AI functions out of the box. Generally, Synapse SQL Pools are part of an Azure SQL Server instance and can be browsed using tools like SSMS as well. The Synapse SQL feature is also available in a serverless manner (in preview as of Sept 2020), where no fixed infrastructure capacity needs to be provisioned; instead, Azure manages the infrastructure capacity required to meet the needs of the workloads. This is a data virtualization feature supported by Synapse SQL. The pricing model in this case is based on the volume of data processed instead of the number of DWUs allocated to the instance.
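The difference between the two pricing models can be illustrated with a small sketch. The rates below are hypothetical placeholders, not actual Azure prices; the point is only the shape of the two models (capacity over time vs. data scanned):

```python
# Illustrative comparison of the two Synapse SQL pricing models described
# above. The rates are hypothetical placeholders, NOT actual Azure prices.

PROVISIONED_RATE_PER_DWU_HOUR = 0.012   # hypothetical $/DWU-hour
SERVERLESS_RATE_PER_TB = 5.0            # hypothetical $/TB processed

def provisioned_cost(dwu: int, hours: float) -> float:
    """Provisioned SQL pool: pay for the allocated DWU capacity over time."""
    return dwu * hours * PROVISIONED_RATE_PER_DWU_HOUR

def serverless_cost(tb_processed: float) -> float:
    """Serverless SQL: pay only for the volume of data the queries process."""
    return tb_processed * SERVERLESS_RATE_PER_TB

# A pool of 500 DWU running for 24 hours vs. ad-hoc queries scanning 2 TB:
print(round(provisioned_cost(500, 24), 2))  # 144.0
print(round(serverless_cost(2.0), 2))       # 10.0
```

For steady, heavy workloads the provisioned model amortizes well; for occasional ad-hoc exploration the serverless model avoids paying for idle capacity.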
• Apache Spark for Azure Synapse – This component provides a Spark runtime to perform the same kinds of tasks – data loading, data processing, data preparation, ETL – that are generally related to data warehousing. Azure also provides Databricks as a service based on the Spark runtime, with a certain set of optimizations, which is typically used for a similar set of purposes. One advantage of this feature compared to Azure Databricks is that no additional or separate clusters need to be managed, as Spark is an integral part of Synapse. It provides Spark-based processing with auto-scaling and support for features like .NET for Spark, SparkML algorithms, Delta Lake, Azure ML integration for Apache Spark, and Jupyter-style notebooks. In addition, it has multi-language support for C#, PySpark, Scala, Spark SQL, Java, etc. Once a Synapse workspace is created, one can provision Apache Spark pools or Synapse SQL pools from a common interface.
• Azure Synapse Security – Apart from the above features, one key aspect to note is the array of security features packed into Azure Synapse. It is already compliant with almost 30 industry-leading standards like ISO, SOC, FedRAMP, DISA, HIPAA, and FIPS.
o It supports Azure AD authentication and SQL-based authentication, as well as multi-factor authentication.
o It supports data encryption at rest and in transit, as well as data classification for sensitive data.
o It supports row-level, column-level, and object-level security, along with dynamic data masking.
o It supports network-level security with virtual networks as well as firewalls.

Azure Synapse is a tightly integrated suite of services that covers the entire spectrum of tasks and processes used in the workflow of an analytical solution. These architectural components provide a modular view of the entire suite to help you get a head start.

Azure Synapse Analytics Features

• Centralized Data Management: Azure Synapse utilizes Massively Parallel Processing (MPP) technology, which allows it to process and manage large workloads and efficiently handle large data volumes. It delivers a unified experience by managing both data lakes and data warehouses.
• Workload Isolation: This capability allows users to manage the execution of heterogeneous workloads. It offers increased flexibility by exclusively reserving resources for a specific workload group, while retaining complete control over warehouse resources to satisfy business SLAs.
• Machine Learning Integration: By integrating Azure Machine Learning, Azure Synapse Analytics enables leveraging ML capabilities. This can help predict and score ML models to generate
predictions within the data warehouse itself. Further, it allows importing existing, trained ML models into Synapse Analytics instead of recreating the entire model again, which helps businesses save time, money, and effort. Businesses can also analyze the data using machine learning algorithms and visualize the results on a rich Power BI dashboard. This makes it a great tool for companies to manage real-time analytics for:
• Supply chain forecasting
• Inventory reporting
• Predictive maintenance
• Anomaly detection

Azure Synapse Analytics Benefits

Businesses today use a variety of tools to manage, store, and analyze workloads, and things can go wrong when one of the interconnected systems faces downtime or another technical challenge. Azure Synapse Analytics offers businesses centralized management of data lakes and data warehouses. Azure Synapse Studio offers a unified workspace for data preparation, data management, data warehousing, big data, and artificial intelligence tasks. Here are some of the salient benefits:

Accelerate Analytics & Reporting
• Reduced manual effort for collecting, collating, and building reports
• Instant scalability and flexibility, leading to no downtime under workload variations
• Faster DWH deployment

Better BI & Data Visualization
• The seamless, native integration with Power BI makes reporting and analysis of key metrics engaging and easy to use, and makes results easier to share with relevant stakeholders across business streams. A word of advice: fewer data silos lead to more visibility.

Increased IT Productivity
• Enables staff to automate infrastructure provisioning and administrative tasks (including DWH setup, patch management, and maintenance)
Limitless Scaling
• Being a cloud-based service, Azure Synapse Analytics can view, organize, and query relational and non-relational data faster than traditional on-premises tools. In other words, you can efficiently manage thousands of concurrent users and systems. And, when compared with Google's BigQuery, Synapse could run the same query in roughly 75% less time over a petabyte of data.

Azure HDInsight

Azure HDInsight is a service offered by Microsoft that enables us to use open-source frameworks for big data analytics. Azure HDInsight allows the use of frameworks like Hadoop, Apache Spark, Apache Hive, LLAP, Apache Kafka, Apache Storm, R, etc., for processing large volumes of data. These tools can be used to perform extract, transform, and load (ETL), data warehousing, machine learning, and IoT workloads.

Azure HDInsight Features

The main features of Azure HDInsight that set it apart are:
• Cloud and on-premises availability: Azure HDInsight supports big data analytics using Hadoop, Spark, Interactive Query (LLAP), Kafka, Storm, etc., on the cloud as well as on-premises.
• Scalable and economical: HDInsight can be scaled up or down as and when required, which means you pay only for what you use. You can upgrade your HDInsight when required, and this eliminates having to pay for unused resources.
• Security: Azure HDInsight protects your assets with industry-standard security. Encryption and integration with Active Directory make sure that your assets are safe in the Azure Virtual Network.
• Monitoring and analytics: HDInsight's integration with Azure Monitor helps us closely watch what is happening in our clusters and take action based on that.
• Global availability: Azure HDInsight is more globally available than any other big data analytics service.
• Highly productive: Productive tools for Hadoop and Spark can be used in HDInsight in different development environments like Visual Studio, VS Code, Eclipse, and IntelliJ, with support for Scala, Python, R, Java, etc.
Azure HDInsight Architecture

Before getting into the uses of Azure HDInsight, let's understand how to choose the right architecture for it. Listed below are best practices for Azure HDInsight architecture:
• It is recommended that you migrate an on-premises Hadoop cluster to Azure HDInsight using multiple workload-optimized clusters rather than a single cluster; a single large cluster left running over time will increase your costs unnecessarily.
• Use on-demand transient clusters, so that clusters are deleted after the workload is complete. This reduces resource costs for clusters that would otherwise sit idle. Deleting a cluster does not delete the associated metastores or storage accounts, so you can use them to recreate the cluster if necessary.
• Because HDInsight clusters can use storage from Azure Storage, Azure Data Lake Storage, or both, it is best to separate data storage from processing. In addition to reducing storage costs, this also allows you to use transient clusters, share data, and scale storage and compute independently.

Azure HDInsight Metastore Best Practices

The Apache Hive metastore is an important aspect of the Apache Hadoop architecture, since it serves as a central schema repository for other big data access resources including Apache Spark, Interactive Query (LLAP), Presto, and Apache Pig. It is worth noting that HDInsight uses Azure SQL Database as its Hive metastore database. There are two types of HDInsight metastores: default metastores and custom metastores.
• A default metastore can be created for free for any cluster type, but once created it cannot be shared.
• Custom metastores are recommended for production clusters, since they can be created and removed without loss of metadata. It is suggested to use a custom metastore to isolate compute and metadata, and to back it up periodically.
HDInsight deletes a default Hive metastore as soon as its cluster is destroyed. By storing the Hive metastore in your own Azure SQL database instead, you avoid losing it when deleting the cluster.
Azure Log Analytics and the Azure portal provide monitoring tools for monitoring metastore performance. Make sure that your metastore is in the same region as your HDInsight cluster.

Azure HDInsight Migration

The following are best practices for Azure HDInsight migration:

Script migration or replication can be used to migrate the Hive metastore. To migrate the Hive metastore with scripts, generate Hive DDLs from the existing metastore, edit the generated DDL to replace HDFS URLs with WASB/ADLS/ABFS URLs, and then run the modified DDL on the new metastore. Both the on-premises and cloud versions of the metastore need to be compatible.

Migration using DB replication: When migrating your Hive metastore using DB replication, you can use the Hive MetaTool to replace HDFS URLs with WASB/ADLS/ABFS URLs. Here's an example (with placeholder container and storage-account names):

./hive --service metatool -updateLocation hdfs://nn1:8020/ wasb://<container>@<storage-account>.blob.core.windows.net/

Azure offers two approaches for migrating data from on-premises: migrating offline or migrating over TLS. The best choice will likely depend on how much data you need to migrate.

Migrating over TLS: Microsoft Azure Storage Explorer, AzCopy, Azure PowerShell, and the Azure CLI can be used to migrate data over TLS to Azure Storage.

Migrating offline: Data Box, Data Box Disk, and Data Box Heavy devices are also available for offline shipment of large amounts of data to Azure. As an alternative, you can use native tools such as Apache Hadoop DistCp, Azure Data Factory, or AzCopy to transfer data over the network.

Azure HDInsight Security and DevOps

To protect and maintain the cluster, it is wise to use the Enterprise Security Package (ESP), which provides directory-based authentication, multi-user support, and role-based access control.
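The DDL-editing step in the metastore migration above is essentially a bulk URL rewrite. A minimal sketch of that step in Python, where the container and storage-account names are illustrative placeholders:

```python
# Sketch of the DDL-editing step in a Hive metastore migration: replace
# on-premises HDFS root URLs with an Azure storage URL.
# Container/account names below are hypothetical examples.
NEW_ROOT = "wasb://data@examplestore.blob.core.windows.net/"

def rewrite_locations(ddl: str, old_root: str = "hdfs://nn1:8020/") -> str:
    """Swap every HDFS root URL in a generated DDL script for the WASB root."""
    return ddl.replace(old_root, NEW_ROOT)

ddl = "CREATE EXTERNAL TABLE sales (id INT) LOCATION 'hdfs://nn1:8020/warehouse/sales';"
print(rewrite_locations(ddl))
```

In practice the Hive MetaTool's `-updateLocation` option performs the same substitution directly against the metastore, as shown in the command above.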
The ESP framework can be used with a range of cluster types, including Apache Hadoop, Apache Spark, Apache HBase, Apache Kafka, and Interactive Query (Hive LLAP). To ensure your HDInsight deployment is secure, take the following steps:
Azure Monitor: Use the Azure Monitor service for monitoring and alerting.

Stay on top of updates: Always upgrade HDInsight to the latest version, install OS patches, and reboot your nodes.

Enforce end-to-end enterprise security, with features such as auditing, encryption, authentication, authorization, and a private pipeline. Azure Storage keys should also be encrypted. By using Shared Access Signatures (SAS), you can limit access to your Azure Storage resources. Azure Storage automatically encrypts data written to it using Storage Service Encryption (SSE), with replication.

Make sure to update HDInsight at regular intervals. To do this, you can follow the steps outlined below:
• Set up a new HDInsight cluster and apply the most recent HDInsight update to it.
• Ensure the current cluster has enough workers and workloads.
• Change applications or workloads as needed.
• Back up all temporary data stored on cluster nodes.
• Delete the existing cluster.
• Create a fresh new HDInsight cluster with the same default data and metastore as previously.
• Import any temporary file backups.
• Finish in-progress jobs on the new cluster, or start new ones.

Azure HDInsight Uses

The main scenarios in which we can use Azure HDInsight are:

Data Warehousing

Data warehousing is the storage of large volumes of data for retrieval and analysis at any point in time. Businesses maintain data warehouses to analyze their data and make strategic decisions based on it. HDInsight can be used for data warehousing by performing queries at very large scale on structured or unstructured data.
Internet of Things (IoT)

We are surrounded by a large number of smart devices that make our lives easier. These IoT-enabled devices relieve us of the many small decisions involved in operating our devices. IoT requires the processing and analysis of data coming in from millions of smart devices; this data is the backbone of IoT, and maintaining and processing it is vital for the proper functioning of IoT-enabled devices. Azure HDInsight can help in processing large volumes of data coming from numerous devices.
Data Science

Building applications that can analyze data and act on it is vital for AI-enabled solutions. These apps need to be powerful enough to process large volumes of data and make decisions based on it. An example worth noting is the software used in self-driving cars, which has to keep learning from new experiences as well as from historical data in order to make real-time decisions. Azure HDInsight helps in building applications that can extract vital information by analyzing large volumes of data.

Hybrid Cloud

A hybrid cloud is when companies use both public and private clouds for their workflows, gaining the benefits of both, such as security, scalability, and flexibility. Azure HDInsight can be used to extend a company's on-premises infrastructure to the cloud for better analytics and processing in a hybrid situation.
Azure Databricks

Databricks Introduction (https://intellipaat.com/blog/what-is-azure-databricks/#no9)

Databricks is a software company founded by the creators of Apache Spark. The company has also created well-known software such as Delta Lake, MLflow, and Koalas – popular open-source projects that span data engineering, data science, and machine learning. Databricks develops web-based platforms for working with Spark that provide automated cluster management and IPython-style notebooks.

Databricks in Azure
Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. Azure Databricks offers three environments:
• Databricks SQL
• Databricks Data Science and Engineering
• Databricks Machine Learning

Databricks SQL

Databricks SQL provides a user-friendly platform that helps analysts who work with SQL queries to run queries on Azure Data Lake, create multiple visualizations, and build and share dashboards.

Databricks Data Science and Engineering

Databricks Data Science and Engineering provides an interactive working environment for data engineers, data scientists, and machine learning engineers. The two ways to send data through the big data pipeline are:
• Ingest into Azure through Azure Data Factory in batches
• Stream in real time by using Apache Kafka, Event Hubs, or IoT Hub
Databricks Machine Learning

Databricks Machine Learning is a complete machine learning environment. It helps manage services for experiment tracking, model training, feature development, and feature management. It also does model serving.

Pros and Cons of Azure Databricks

Moving ahead, let's discuss the pros and cons of Azure Databricks and understand how good it really is.

Pros
• It can process large amounts of data, and since it is part of Azure, the data is cloud-native.
• The clusters are easy to set up and configure.
• It has an Azure Synapse Analytics connector, as well as the ability to connect to Azure DB.
• It is integrated with Active Directory.
• It supports multiple languages. Scala is the main language, but it also works well with Python, SQL, and R.

Cons
• It does not integrate with Git or any other versioning tool.
• It currently only supports HDInsight, and not Azure Batch or AZTK.

Databricks SQL

Databricks SQL allows you to run quick ad-hoc SQL queries on the data lake. Integration with Azure Active Directory enables running complete Azure-based solutions using Databricks SQL. By integrating with Azure databases, Databricks SQL can work with data in Synapse Analytics, Azure Cosmos DB, Data Lake Store, and Blob Storage. Integration with Power BI allows users to discover and share insights more easily. BI tools, such as Tableau Software, can also be used for accessing Databricks.
The interface that allows automation of Databricks SQL objects is the REST API.

Data Management

It has three parts:
• Visualization: A graphical presentation of the result of running a query
• Dashboard: A presentation of query visualizations and commentary
• Alert: A notification that a field returned by a query has reached a threshold

Computation Management

Here are the terms that help in running SQL queries in Databricks SQL:
• Query: A valid SQL statement
• SQL endpoint: A resource where SQL queries are executed
• Query history: A list of previously executed queries and their characteristics

Authorization
• User and group: A user is an individual who has access to the system. A set of multiple users is known as a group.
• Personal access token: An opaque string used to authenticate to the REST API.
• Access control list: A set of permissions attached to a principal that requires access to an object. The ACL (Access Control List) specifies the object and the actions allowed on it.

Databricks Data Science & Engineering

Databricks Data Science & Engineering is sometimes also called the Workspace. It is an analytics platform based on Apache Spark.
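A personal access token, mentioned under Authorization above, is typically passed to a REST API as a bearer token in the request's Authorization header. A minimal sketch of building such a request; the workspace URL and token are placeholder values, and the code only constructs the request object without sending it:

```python
from urllib.request import Request

# Hypothetical workspace URL and token, for illustration only.
WORKSPACE_URL = "https://example-workspace.azuredatabricks.net"
TOKEN = "dapi-EXAMPLE-TOKEN"

def build_api_request(path: str) -> Request:
    """Build an authenticated REST API request using a personal access token."""
    return Request(
        WORKSPACE_URL + path,
        headers={"Authorization": f"Bearer {TOKEN}"},
    )

req = build_api_request("/api/2.0/clusters/list")
print(req.get_header("Authorization"))  # Bearer dapi-EXAMPLE-TOKEN
```

Because the token is an opaque string, it should be stored in a secret store or environment variable rather than hard-coded as in this sketch.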
Databricks Data Science & Engineering comprises complete open-source Apache Spark cluster technologies and capabilities. Spark in Databricks Data Science & Engineering includes the following components:
• Spark SQL and DataFrames: The Spark module for working with structured data. A DataFrame is a distributed collection of data organized into named columns, very similar to a table in a relational database or a data frame in R or Python.
• Streaming: Real-time data processing and analysis for analytical and interactive applications; it integrates with HDFS, Flume, and Kafka.
• MLlib: Short for Machine Learning Library, consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, and dimensionality reduction, as well as the underlying optimization primitives.
• GraphX: Graphs and graph computation for a broad scope of use cases, from cognitive analytics to data exploration.
• Spark Core API: Support for R, SQL, Python, Scala, and Java.

Integration with Azure Active Directory enables you to run complete Azure-based solutions using Databricks. By integrating with Azure databases, Databricks can work with data in Synapse Analytics, Cosmos DB, Data Lake Store, and Blob Storage. By integrating with Power BI, users can discover and share insights more easily. BI tools, such as Tableau Software, can also be used.

Workspace

The Workspace is the place for accessing all Azure Databricks assets. It organizes objects into folders and provides access to data objects and computational resources. The workspace contains:
• Dashboard: Provides access to visualizations.
• Library: A package available to a notebook or job running on the cluster. We can also add our own libraries.
• Repo: A folder whose contents are co-versioned together by syncing them to a local Git repository.
• Experiment: A collection of MLflow runs for training an ML model.
Interface

Azure Databricks supports the UI, the API, and the command line (CLI).
• UI: Provides a user-friendly interface to workspace folders and their resources.
• REST API: There are two versions, REST API 2.0 and REST API 1.2. REST API 2.0 has the features of REST API 1.2 along with some additional features, so REST API 2.0 is the preferred version.
• CLI: An open-source project available on GitHub, built on REST API 2.0.

Data Management
• Databricks File System (DBFS): An abstraction layer over the blob store. It contains directories that can contain files or further directories.
• Database: A collection of information that can be managed and updated.
• Table: Tables can be queried with Apache Spark SQL and the Apache Spark APIs.
• Metastore: Stores information about the various tables and partitions in the data warehouse.

Computation Management

To run computations in Azure Databricks, we need to know about the following:
• Cluster: A set of computation resources and configurations on which we can run notebooks and jobs. There are two types:
o All-purpose: We create an all-purpose cluster using the UI, CLI, or REST API. We can manually terminate and restart an all-purpose cluster, and multiple users can share it for collaborative, interactive analysis.
o Job: The Azure Databricks job scheduler creates a job cluster when we run a job on a new job cluster and terminates the cluster when the job is complete. We cannot restart a job cluster.
• Pool: A set of ready-to-use instances that reduces cluster start and auto-scaling times. If the pool does not have enough resources, it expands itself. When the attached cluster is terminated, the instances it used are returned to the pool and can be reused by a different cluster.
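The pool behavior just described – hand out warm instances, expand when empty, take instances back on cluster termination – can be sketched as a toy model (an illustration of the idea, not the actual Databricks implementation):

```python
# Toy model of an instance pool: warm instances are reused to cut cluster
# start time; the pool grows when it runs out. Not the real implementation.

class InstancePool:
    def __init__(self, warm: int):
        self._next_id = 0
        self.idle = [self._new_instance() for _ in range(warm)]

    def _new_instance(self) -> str:
        self._next_id += 1
        return f"instance-{self._next_id}"

    def acquire(self) -> str:
        """Hand out a warm instance, expanding the pool if it is empty."""
        if not self.idle:
            self.idle.append(self._new_instance())  # pool expands itself
        return self.idle.pop()

    def release(self, instance: str) -> None:
        """On cluster termination, instances return to the pool for reuse."""
        self.idle.append(instance)

pool = InstancePool(warm=2)
a = pool.acquire()     # served from warm capacity
b = pool.acquire()
c = pool.acquire()     # pool was empty, so it expanded
pool.release(a)        # returned instance can be reused by another cluster
print(len(pool.idle))  # 1
```

The cost trade-off is the same as with the real feature: idle warm instances are paid for, but clusters attached to the pool start much faster.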
Databricks Runtime

The core components that run on clusters managed by Azure Databricks come in several runtimes:
• Databricks Runtime includes Apache Spark but also adds numerous other features to improve big data analytics.
• Databricks Runtime for Machine Learning is built on Databricks Runtime and provides a ready environment for machine learning and data science.
• Databricks Runtime for Genomics is a version of Databricks Runtime optimized for working with genomic and biomedical data.
• Databricks Light is the Azure Databricks packaging of the open-source Apache Spark runtime.

Job
• Workload: There are two types of workloads with respect to the pricing schemes:
o Data engineering workload: Runs on a job cluster.
o Data analytics workload: Runs on an all-purpose cluster.
• Execution context: The state of a REPL environment. It supports Python, R, Scala, and SQL.

Model Management

The concepts needed to understand how machine learning models are built are:
• Model: A mathematical function that represents the relation between inputs and outputs. Machine learning consists of training and inference steps: we train a model on an existing data set and use it to predict the outcomes of new data.
• Run: A collection of parameters, metrics, and tags related to training a machine learning model.
• Experiment: The primary unit of organization and access control for runs. All MLflow runs belong to an experiment.

Authentication and Authorization
• User and group: A user is an individual who has access to the system. A set of users is a group.
• Access control list: An access control list (ACL) is a set of permissions attached to a principal that requires access to an object. The ACL specifies the object and the actions allowed on it.
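An ACL as described above – permissions attached to a principal, naming the object and the allowed actions – can be modeled minimally like this (the principal, object, and action names are illustrative, not Databricks' actual permission model):

```python
# Minimal model of an access control list: each (principal, object) pair
# maps to the set of allowed actions. Names are hypothetical examples.

acl = {
    ("alice", "notebook:etl"): {"read", "run", "edit"},
    ("analysts", "notebook:etl"): {"read"},
}

def is_allowed(principal: str, obj: str, action: str) -> bool:
    """Check whether the ACL grants this principal the action on the object."""
    return action in acl.get((principal, obj), set())

print(is_allowed("alice", "notebook:etl", "edit"))     # True
print(is_allowed("analysts", "notebook:etl", "edit"))  # False
```

Note the default-deny behavior: a principal with no entry for an object gets no actions at all, which is the safe default for access control.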
Databricks Machine Learning

Databricks Machine Learning is an integrated end-to-end machine learning platform incorporating managed services for experiment tracking, model training, feature development and management, and feature and model serving. It automates the creation of a cluster optimized for machine learning. Databricks Runtime ML clusters include the most popular machine learning libraries, such as TensorFlow, PyTorch, Keras, and XGBoost, as well as libraries required for distributed training, such as Horovod.

With Databricks Machine Learning, we can:
• Train models either manually or with AutoML
• Track training parameters and models using experiments with MLflow tracking
• Create feature tables and access them for model training and inference
• Share, manage, and serve models using Model Registry

We also have access to all the capabilities of the Azure Databricks workspace, such as notebooks, clusters, jobs, data, Delta tables, security and admin controls, and more.

When to use Databricks

1. Modernize your data lake – if you are facing challenges around performance and reliability in your data lake, or your data lake has become a data swamp, consider Delta as an option to modernize it.
2. Production machine learning – if your organization is doing data science work but is having trouble getting that work into the hands of business users, the Databricks platform was built to help data scientists move their work from development to production.
3. Big data ETL – from a cost/performance perspective, Databricks is best in its class.
4.
Opening your data lake to BI users – if your analyst/BI group is consistently slowed down by the major lift of the engineering team having to build a pipeline every time they want to access new data, it might make sense to open the data lake to these users through a tool like SQL Analytics within Databricks.

When not to use Databricks

There are a few scenarios where Databricks is probably not the best fit for your use case:
1. Sub-second queries – Spark, being a distributed engine, has processing overhead that makes it nearly impossible to get sub-second queries. Your data can still live in the data lake, but for sub-second queries you will likely want a highly tuned speed layer.
2. Small data – similar to the first point, you won't get the majority of the benefits of Databricks if you are dealing with very small data (think GBs).
3. Pure BI without a supporting data engineering team – Databricks and SQL Analytics do not erase the need for a data engineering team; in fact, such a team is more critical than ever in unlocking the potential of the data lake. That said, Databricks offers tools to enable the data engineering team itself.
4. Teams requiring drag-and-drop ETL – Databricks has many UI components, but drag-and-drop code authoring is not currently one of them.

Usage of Internet of Things (IoT) Hub

Azure IoT Hub lets you get on with developing cool IoT solutions without worrying about how it all gets connected up and managed. The Internet of Things (IoT) offers businesses immediate, real-world opportunities to reduce costs and increase revenue, as well as to transform their businesses.

Azure IoT Hub is a managed IoT service hosted in the cloud. It allows bi-directional communication between IoT applications and the devices they manage. This cloud-to-device connectivity means that you can receive data from your devices, but you can also send commands and policies back to them. Where Azure IoT Hub differs from existing solutions is that it also provides the infrastructure to authenticate, connect, and manage the devices connected to it.
Azure IoT Hub allows full-featured and scalable IoT solutions. Virtually any device can be connected to Azure IoT Hub, and it can scale up to millions of devices. Events such as the creation, failure, and connection of devices can be tracked and monitored.

Azure IoT Hub provides:
• Device libraries for the most commonly used platforms and languages, for easy device connectivity.
• Secure communications, with multiple options for device-to-cloud and cloud-to-device hyper-scale communication.
• Queryable storage of per-device state information as well as metadata.

Managing devices with IoT Hub

The needs and requirements of IoT operators vary substantially across industries, from transport to manufacturing to agriculture to utilities, and there is also wide variation in the types of devices they use. IoT Hub provides the capabilities, patterns, and code libraries that allow developers to build management solutions for very diverse sets of devices.

Configuring and controlling devices

Devices connected to IoT Hub can be managed using an array of built-in functionality. This means that:
• Device metadata and state information for all your devices can be stored, synchronized, and queried.
• Device state can be set either per device or in groups, based on common characteristics of the devices.
• A state change in a device can be automatically responded to by using message routing integration.

The lifecycle of devices with IoT Hub
• Plan: Operators can create a device metadata scheme that allows them to easily carry out bulk management
operations.
 Provision – New devices can be securely provisioned to IoT Hub, and operators can quickly discover device capabilities. The IoT Hub identity registry is used to create device identities and credentials.
 Configure – Device management operations such as configuration changes and firmware updates can be carried out in bulk or by direct methods, while still maintaining system security.
 Monitor – Operators can be alerted easily to any issues arising, while monitoring overall device collection health and the status of any ongoing operations.
 Retire – Devices eventually need to be replaced, retired or decommissioned. The IoT Hub identity registry is used to withdraw device identities and credentials.

Device management patterns
IoT Hub supports a range of device management patterns, including:
 Reboot
 Factory reset
 Configuration
 Firmware update
 Reporting progress and status
These patterns can be extended to fit your exact situation, or new patterns can be designed based on these templates.

Connecting your devices
You can build applications which run on your devices and interact with IoT Hub using the Azure IoT device SDK. Windows, Linux distributions, and real-time operating systems are supported platforms. Supported languages currently include:
 C
 C#
 Java
 Python
 Node.js

Messaging Patterns
Azure IoT Hub supports a range of messaging patterns, including:
 Device-to-cloud telemetry
 File upload from devices
 Request-reply methods, which enable devices to be controlled from the cloud

Message routing and Event Grid
Both IoT Hub message routing and IoT Hub integration with Event Grid make it possible to stream data from your connected devices. However, there are differences. Message routing allows users to route device-to-cloud messages to a range of supported service endpoints, such as Event Hubs and Azure Storage containers, while IoT Hub integration with Event Grid is a fully managed routing service which can be extended into third-party business applications.

Device data can be routed
In Azure IoT Hub, the message routing functionality is built in. This allows you to set up automatic rules-based message fan-out. You can use message routing to decide where your hub sends your devices' telemetry. Routing messages to multiple endpoints doesn't incur any extra cost.

Building end-to-end solutions
End-to-end solutions can be built by integrating IoT Hub with other Azure services. For example:
 Business processes can be automated using Azure Logic Apps.
 You can run analytic computations in real time on the data from your devices using Azure Stream Analytics.
 AI models and machine learning can be added using Azure Machine Learning.
 You can respond rapidly to critical events with Azure Event Grid.

Azure IoT Hub or Azure Event Hub?
Both Azure IoT Hub and Azure Event Hub are cloud services which can ingest, process and store large amounts of data. However, they were designed with different purposes in mind. Event Hub was developed for big data streaming, while IoT Hub was designed specifically to connect IoT devices at scale to the Azure cloud. Therefore, which one you choose will depend on the demands of your business.

Security
Businesses face security, privacy, and compliance challenges which are unique to the IoT.
Security for IoT solutions means that devices need to be securely provisioned, that connectivity between the devices and the cloud is secure, and that data is protected in the cloud during processing and storage.
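One common way a device proves its identity to the hub is a shared access signature (SAS) token: an HMAC-SHA256 signature over the URL-encoded resource URI plus an expiry time, computed with a per-device symmetric key. The sketch below follows that widely documented token format, but the hub name and key are made up; in practice the Azure device SDKs generate tokens for you:

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote_plus

def generate_sas_token(resource_uri, device_key_b64, ttl_seconds=3600, now=None):
    """Build an IoT Hub-style SAS token: sign '<encoded uri>\n<expiry>'
    with the base64-decoded device key, then URL-encode the signature."""
    expiry = int((now if now is not None else time.time()) + ttl_seconds)
    uri = quote_plus(resource_uri)
    to_sign = f"{uri}\n{expiry}".encode("utf-8")
    key = base64.b64decode(device_key_b64)
    sig = base64.b64encode(hmac.new(key, to_sign, hashlib.sha256).digest())
    return f"SharedAccessSignature sr={uri}&sig={quote_plus(sig)}&se={expiry}"

# Fixed 'now' makes the token deterministic for demonstration purposes.
token = generate_sas_token(
    "myhub.azure-devices.net/devices/thermostat-01",
    base64.b64encode(b"not-a-real-key").decode(),
    now=0)
print(token)
```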
IoT Hub allows data to be sent over secure communication channels. Each device connects securely to the hub and can be managed securely. You can control access at the per-device level, and devices are automatically provisioned to the correct hub when they first boot up. There is also a range of authentication types depending on device capabilities, including SAS token-based authentication, individual X.509 certificate authentication for secure, standards-based authentication, and X.509 CA authentication.

High Availability and Disaster Recovery
Uptime goals vary from business to business. Azure IoT Hub offers three main High Availability (HA) and Disaster Recovery (DR) features:
 Intra-region HA – The IoT Hub service provides intra-region HA by implementing redundancies in almost all layers of the service. The SLA published by the IoT Hub service is achieved by making use of these redundancies, which are available to developers automatically. However, transient failures should be expected when using cloud computing, so appropriate retry policies need to be built into components which interact with the cloud in order to deal with them.
 Cross-region DR – Situations may arise in which a datacentre suffers an extended outage or some other physical failure. It is rare but possible that intra-region HA cannot help in some of these situations. However, IoT Hub offers several options for recovering from extended outages or physical failures: a customer can have a Microsoft-initiated failover or a manual failover. Both of these options offer defined recovery time objectives (RTO).
 Achieving cross-region HA
 If the RTOs provided by either the Microsoft-initiated failover or manual failover aren't sufficient for your uptime goals, another option is to implement a per-device automatic cross-region failover mechanism. In this model, the IoT solution runs in a primary and a secondary datacentre in two different locations. If there is an outage or a loss of network connectivity in the primary region, the devices can use the secondary location.

Choosing the right IoT Hub tier
Azure IoT Hub offers two tiers, basic and standard. The basic tier, which is uni-directional from devices to the cloud, is more suitable if data is simply gathered from devices and analyzed centrally. However, if you want bi-directional communication, enabling you, for example, to control devices remotely, then the standard tier is more appropriate. Both tiers have the same security and authentication features.
Each tier comes in three sizes (1, 2 and 3), depending on how much data it can handle in a day. For instance, a level 3 unit can handle 300 million messages a day, while a level 1 unit can handle 400,000.

IoT Central
https://www.thethingsindustries.com/docs/integrations/cloud-integrations/azure-iot-central/device-templates/
IoT Central is an IoT application platform as a service (aPaaS) that reduces the burden and cost of developing, managing, and maintaining enterprise-grade IoT solutions. If you choose to build with IoT Central, you'll have the opportunity to focus time, money, and energy on transforming your business with IoT data, rather than just maintaining and updating a complex and continually evolving IoT infrastructure.
The web UI lets you quickly connect devices, monitor device conditions, create rules, and manage millions of devices and their data throughout their life cycle. Furthermore, it enables you to act on device insights by extending IoT intelligence into line-of-business applications.
The key features of the Azure IoT Central integration are:
 Handling uplink messages – The Things Stack publishes uplink messages to an Azure IoT Central application.
 Automatic device provisioning – End devices are automatically created in the Azure IoT Central application, using the LoRaWAN device repository information to provision the end-device template.
 Updating device state in the Device Twin – The integration updates the device's reported properties based on the decoded payloads, and schedules downlinks based on the device's desired properties.

Architecture
The Azure IoT Central integration does not require any additional physical resources in your Azure account. It connects to the Azure IoT Central application using the underlying Azure IoT Device Provisioning Service, then submits traffic using the Azure IoT Hub in which the application has been provisioned.
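The Device Twin behaviour described above (desired properties flowing down to the device, reported properties flowing back up) can be sketched with a plain dictionary model. The `$version` counter mirrors the twin's real per-section versioning, but the merge logic here is a deliberate simplification:

```python
def apply_desired_patch(twin, patch):
    """Simplified Device Twin update: merge the back end's desired
    properties into the twin, let the 'device' act on them, and echo
    the applied values into the reported properties section."""
    twin["desired"].update(patch)
    twin["desired"]["$version"] += 1
    # A real device-side handler would act on each property here (e.g.
    # reconfigure its uplink interval); we acknowledge by mirroring
    # the value into the reported section.
    twin["reported"].update(patch)
    return twin

twin = {"desired": {"$version": 1}, "reported": {}}
apply_desired_patch(twin, {"uplink_interval_s": 300})
print(twin["desired"]["$version"])  # 2
print(twin["reported"])             # {'uplink_interval_s': 300}
```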
The single resource deployed in your Azure account is the Azure IoT Central application. All permissions are the minimum permissions required for the integration to function.

Implementation details
Azure IoT Hub is designed around standalone end devices communicating directly with the hub. Each end device must connect to the hub via one of the supported communication protocols (MQTT / AMQP). These protocols are inherently stateful: each individual end device must keep one connection open at all times in order to send and receive messages from the Azure IoT Hub.
LoRaWAN end devices are, in general, low-power, low-resource devices with distinct traffic patterns. Communication in the LoRaWAN world also has no concept of a connection in the TCP sense, but instead focuses on a communication session. Downlink traffic, which would map to IoT Hub cloud-to-device messages, occurs rarely at the application layer for most use cases. As such, keeping a connection open per end device is both wasteful and hard to scale, since both communication protocols mentioned above require each end device to have its own individual connection, and no subscription-group semantics are available.
Based on these considerations, the Azure IoT Central integration prefers an asynchronous, stateless communication style. When uplink messages are received from an end device, the integration connects on demand to the Azure IoT Hub, submits the message, and also updates the Device Twin. The data-plane protocol used between The Things Stack and Azure IoT Hub is MQTT, and the connections are always secured using TLS 1.2.
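That connect-on-demand style can be sketched as follows. `RecordingTransport` is a stand-in for a real MQTT client (it just logs calls), and the topic string follows IoT Hub's documented device-to-cloud convention; everything else is illustrative:

```python
class OnDemandPublisher:
    """Stateless submission: open a connection only when an uplink
    arrives, publish, then disconnect, instead of holding one
    long-lived connection per end device. `transport` is any object
    with connect()/publish()/disconnect() methods."""

    def __init__(self, transport):
        self.transport = transport

    def handle_uplink(self, device_id, payload):
        self.transport.connect()
        try:
            # IoT Hub's device-to-cloud MQTT topic convention.
            self.transport.publish(f"devices/{device_id}/messages/events/", payload)
        finally:
            self.transport.disconnect()  # no connection kept open

class RecordingTransport:
    """Test double that records the sequence of transport calls."""
    def __init__(self):
        self.log = []
    def connect(self):
        self.log.append("connect")
    def publish(self, topic, payload):
        self.log.append(("publish", topic))
    def disconnect(self):
        self.log.append("disconnect")

t = RecordingTransport()
OnDemandPublisher(t).handle_uplink("sensor-7", b"\x01\x02")
print(t.log)  # ['connect', ('publish', 'devices/sensor-7/messages/events/'), 'disconnect']
```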
Device Twin desired-property updates and device creation or deletion events are received by The Things Stack using an IoT Central Data Export. The Data Export submits the data via HTTP requests which are authenticated using the API key provided during the integration provisioning, and connections are always made over TLS. This pipeline allows The Things Stack to avoid long-running connections to the Azure IoT Hub.

Azure Sphere
Microsoft's website states that "Azure Sphere is a solution for creating highly secured, connected Microcontroller (MCU) devices". But it is not just about the MCU, of course. The solution also includes an operating system and an application platform. This gives product manufacturers a way to create secured, internet-connected devices that can be controlled, updated, monitored and maintained remotely.
Azure Sphere is a secured, high-level application platform with built-in communication and security features for internet-connected devices. It comprises a secured, connected, crossover microcontroller unit (MCU), a custom high-level Linux-based operating system (OS), and a cloud-based security service that provides continuous, renewable security. The Azure Sphere MCU integrates real-time processing capabilities with the ability to run a high-level operating system.
An Azure Sphere MCU, along with its operating system and application platform, enables the creation of secured, internet-connected devices that can be updated, controlled, monitored, and maintained remotely. A connected device that includes an Azure Sphere MCU, either alongside or in place of an existing MCU, provides enhanced security, productivity, and opportunity. For example:
 A secured application environment, authenticated connections, and opt-in use of peripherals minimize security risks due to spoofing, rogue software, or denial-of-service attacks, among others.
 Software updates can be automatically deployed from the cloud to any connected device to fix problems, provide new functionality, or counter emerging methods of attack, thus enhancing the productivity of support personnel.  Product usage data can be reported to the cloud over a secured connection to help in diagnosing problems and designing new products, thus increasing the opportunity for product service, positive customer interactions, and future development. The Azure Sphere Security Service is an integral aspect of Azure Sphere. Using this service, Azure Sphere MCUs safely and securely connect to the cloud and web. The service ensures that the device boots only with an authorized version of genuine, approved software. In addition, it provides a secured channel through which Microsoft can automatically download and install OS updates to deployed devices in the field to mitigate security problems. Neither manufacturer nor end-user intervention is required, thus closing a common security hole.
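The guarantee that a device "boots only with an authorized version of genuine, approved software" rests on measuring (hashing) each boot stage and comparing the result against an expected value before handing over control. A toy sketch of that chain, with invented stage names and images:

```python
import hashlib

def measure(image: bytes) -> str:
    """Measurement of a boot stage: a SHA-256 digest of its image."""
    return hashlib.sha256(image).hexdigest()

def secure_boot(stages, expected):
    """Each boot stage is measured and compared against the expected
    value; boot halts at the first stage whose software is not the
    approved version, so tampered code never runs."""
    booted = []
    for name, image in stages:
        if measure(image) != expected[name]:
            return booted, f"halt: {name} failed verification"
        booted.append(name)
    return booted, "boot complete"

bootloader = b"bootloader v2"
os_image = b"sphere os 21.01"
expected = {"bootloader": measure(bootloader), "os": measure(os_image)}

print(secure_boot([("bootloader", bootloader), ("os", os_image)], expected))
# (['bootloader', 'os'], 'boot complete')
print(secure_boot([("bootloader", bootloader), ("os", b"tampered os")], expected))
# (['bootloader'], 'halt: os failed verification')
```

The same measurements underpin remote attestation: the cloud service can verify not only that approved software booted, but which version of it.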
Azure Sphere consists of three main parts:

Secured Micro-controller Unit (MCU)
The first part is a crossover class of MCU with built-in Microsoft security technology and connectivity. Each Azure Sphere MCU includes a wireless communications subsystem that provides an internet connection. It is worth mentioning that the Sphere MCU provides a kind of hardware firewall, or "sandbox", that ensures that I/O peripherals are accessible only to the core to which they are mapped. Consequently, you cannot connect any sensors without first declaring them. The application processor features an ARM Cortex-A subsystem, responsible for executing the operating system, applications and services. It supports two operating environments:
 Normal World (NW) – executes code in both user mode and supervisor mode
 Secure World (SW) – executes only the Microsoft-supplied Security Monitor

Secured OS
The second component is a highly secured OS from Microsoft, with a custom kernel running on top of Microsoft's Security Monitor. This creates a trustworthy defense-in-depth platform. The purpose of the OS services is two-fold: to host the application container, and to facilitate communication with the Azure Sphere Security Service described below. These services manage Wi-Fi authentication and the network firewall for all outbound traffic.
Cloud Security
The Azure Sphere Security Service guards every Azure Sphere device by renewing security, identifying emerging threats, and brokering trust among devices and the cloud. It also provides certificate-based authentication. Additionally, the remote attestation service connects with the device to test whether it booted with the correct software, including the correct version. Furthermore, the Security Service distributes automatic updates for the Microsoft-supplied Azure Sphere OS and for OEM software. As a result, manufacturers can securely update their devices remotely without having to worry about whether an update has been falsified. Finally, there is a small module which provides crash reporting for deployed software.

How does Azure Sphere work in practice?
You might wonder how to use Azure Sphere in a real-life scenario. Let's say that our company, Predica, is a manufacturer of washing machines. In our example, Predica provides high-class, intelligent washing machines that users can control remotely from a mobile app. Each washing machine has an embedded Azure Sphere MCU. Predica has a software development team responsible for developing both the software for the washing machines and the mobile application. There is also a support team responsible for maintenance and detection of potential errors. Take a look at the diagram below that visualizes the scenario:
As you can see, there are three main parties in the network:
 Microsoft – handles the security aspect. The Azure Sphere Security Service is used to send system updates automatically, so Predica, as the manufacturer, does not have to worry about them.
 Predica software team – develops and releases revisions of software for the washing machines, which are uploaded to the devices using Microsoft Azure cloud services.
 Predica support team – responsible for maintenance, checking the system and application versions on each washer, and detecting possible issues.
Azure Sphere provides a way to monitor and control all devices in a secured and centralized way. This is the real power of the solution.

How to begin your journey with Azure Sphere?
The Azure Sphere development board (hardware) is already available; you can order it from the Seeed Studio online store. However, once you receive the board, there are a few additional things that you will need to get started:
 Visual Studio 2017 IDE – Enterprise, Professional or Community, version 15.7 or later
 A PC running Windows 10 Anniversary Update or later
 Azure Sphere SDK Preview for Visual Studio
 An unused USB port on the PC
It is important to note that, at this time, the tools for Azure Sphere are still in preview. You do not require a Microsoft Azure cloud subscription to use Azure Sphere and start development.

Azure Sphere and the seven properties of highly secured devices
A primary goal of the Azure Sphere platform is to provide high-value security at a low cost, so that price-sensitive, microcontroller-powered devices can safely and reliably connect to the internet. As network-connected toys, appliances, and other consumer devices become commonplace, security is of utmost importance. Not only must the device hardware itself be secured; its software and its cloud connections must also be secured.
A security lapse anywhere in the operating environment threatens the entire product and, potentially, anything or anyone nearby. Based on Microsoft's decades of experience with internet security, the Azure Sphere team has identified seven properties of highly secured devices. The Azure Sphere platform is designed around these seven properties: Hardware-based root of trust. A hardware-based root of trust ensures that the device and its identity cannot be separated, thus preventing device forgery or spoofing. Every Azure Sphere MCU is identified by an unforgeable cryptographic key that is generated and protected by the Microsoft-designed Pluton security subsystem hardware. This ensures a tamper-resistant, secured hardware root of trust from factory to end user. Defense in depth. Defense in depth provides for multiple layers of security and thus multiple mitigations against each threat. Each layer of software in the Azure Sphere platform verifies that the layer above it is secured.
Small trusted computing base. Most of the device's software remains outside the trusted computing base, thus reducing the surface area for attacks. Only the secured Security Monitor, Pluton runtime, and Pluton subsystem (all of which Microsoft provides) run on the trusted computing base.
Dynamic compartments. Dynamic compartments limit the reach of any single error. Azure Sphere MCUs contain silicon counter-measures, including hardware firewalls, to prevent a security breach in one component from propagating to other components. A constrained, "sandboxed" runtime environment prevents applications from corrupting secured code or data.
Password-less authentication. The use of signed certificates, validated by an unforgeable cryptographic key, provides much stronger authentication than passwords. The Azure Sphere platform requires every software element to be signed. Device-to-cloud and cloud-to-device communications require further authentication, which is achieved with certificates.
Error reporting. Errors in device software or hardware are typical in emerging security attacks; errors that result in device failure constitute a denial-of-service attack. Device-to-cloud communication provides early warning of potential errors. Azure Sphere devices can automatically report operational data and errors to a cloud-based analysis system, and updates and servicing can be performed remotely.
Renewable security. The device software is automatically updated to correct known vulnerabilities or security breaches, requiring no intervention from the product manufacturer or the end user. The Azure Sphere Security Service updates the Azure Sphere OS and your applications automatically.

Azure Sphere architecture
Working together, the Azure Sphere hardware, software, and Security Service enable unique, integrated approaches to device maintenance, control, and security.
The hardware architecture provides a fundamentally secured computing base for connected devices, allowing you to focus on your product. The software architecture, with a secured custom OS kernel running atop the Microsoft-written Security Monitor, similarly enables you to concentrate your software efforts on value-added IoT and device-specific features. The Azure Sphere Security Service supports authentication, software updates, and error reporting over secured cloud-to-device and device-to-cloud channels. The result is a secured communications infrastructure that ensures that your products are running the most up-to-date Azure Sphere OS. For architecture diagrams and examples of cloud architectures, see Browse Azure Architectures.

Hardware architecture
An Azure Sphere crossover MCU consists of multiple cores on a single die, as the figure "Azure Sphere MCU hardware architecture" shows. Each core, and its associated subsystem, is in a different trust domain. The root of trust resides in the Pluton security subsystem. Each layer of the architecture assumes that the layer above it may be compromised. Within each layer, resource isolation and dynamic compartments provide added security.
Microsoft Pluton security subsystem
The Pluton security subsystem is the hardware-based (in-silicon) secured root of trust for Azure Sphere. It includes a security processor core, cryptographic engines, a hardware random number generator, public/private key generation, asymmetric and symmetric encryption, support for elliptic curve digital signature algorithm (ECDSA) verification for secured boot, and measured boot in silicon to support remote attestation with a cloud service, as well as various tampering counter-measures, including an entropy detection unit. As part of the secured boot process, the Pluton subsystem boots various software components. It also provides runtime services, processes requests from other components of the device, and manages critical components for other parts of the device.

High-level application core
The high-level application core features an ARM Cortex-A subsystem with a full memory management unit (MMU). It enables hardware-based compartmentalization of processes by using TrustZone functionality and is responsible for running the operating system, high-level applications, and services. It supports two operating environments: Normal World (NW), which runs code in both user mode and supervisor mode, and Secure World (SW), which runs only the Microsoft-supplied Security Monitor. Your high-level applications run in NW user mode.

Real-time cores
The real-time cores feature an ARM Cortex-M I/O subsystem that can run real-time capable applications as either bare-metal code or a real-time operating system (RTOS). Such applications can map peripherals and communicate with high-level applications but cannot access the internet directly.

Connectivity and communications
The first Azure Sphere MCU provides an 802.11 b/g/n Wi-Fi radio that operates at both 2.4 GHz and 5 GHz.
High-level applications can configure, use, and query the wireless communications subsystem, but they cannot program it directly. In addition to, or instead of, Wi-Fi, Azure Sphere devices that are properly equipped can communicate on an Ethernet network.

Multiplexed I/O
The Azure Sphere platform supports a variety of I/O capabilities, so that you can configure embedded devices to suit your market and product requirements. I/O peripherals can be mapped to either the high-level application core or to a real-time core.

Microsoft firewalls
Hardware firewalls are silicon countermeasures that provide "sandbox" protection to ensure that I/O peripherals are accessible only to the core to which they are mapped. The firewalls impose compartmentalization, thus preventing a security threat that is localized in the high-level application core from affecting the real-time cores' access to their peripherals.
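The firewall rule (a peripheral is reachable only from the core it was mapped to at configuration time) reduces to a simple lookup. The core and peripheral names below are invented for illustration:

```python
# Sketch of the hardware-firewall rule: each peripheral is assigned to
# exactly one core, and any access from another core is rejected.
PERIPHERAL_MAP = {
    "uart0": "real-time-core-0",
    "i2c1": "high-level-core",
}

def access_allowed(core: str, peripheral: str) -> bool:
    """Allow access only when the requesting core owns the peripheral;
    unmapped peripherals are reachable from no core at all."""
    return PERIPHERAL_MAP.get(peripheral) == core

print(access_allowed("high-level-core", "i2c1"))   # True
print(access_allowed("high-level-core", "uart0"))  # False: mapped to another core
```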
Integrated RAM and flash
Azure Sphere MCUs include a minimum of 4 MB of integrated RAM and 16 MB of integrated flash memory.

Software architecture and OS
The high-level application platform runs the Azure Sphere OS along with a device-specific high-level application that can communicate both with the internet and with real-time capable applications running on the real-time cores. The figure "High-level Application Platform" shows the elements of this platform; Microsoft-supplied elements are shown in gray.
Microsoft provides and maintains all software other than your device-specific applications. All software that runs on the device, including the high-level application, is signed by the Microsoft certificate authority (CA). Application updates are delivered through the trusted Microsoft pipeline, and the compatibility of each update with the Azure Sphere device hardware is verified before installation.

Application runtime
The Microsoft-provided application runtime is based on a subset of the POSIX standard. It consists of libraries and runtime services that run in NW user mode. This environment supports the high-level applications that you create.
Application libraries support the networking, storage, and communications features required by high-level applications, but do not support direct generic file I/O or shell access, among other constraints. These restrictions ensure that the platform remains secured and that Microsoft can provide security and maintenance updates. In addition, the constrained libraries provide a long-term stable API surface, so that system software can be updated to enhance security while retaining binary compatibility for applications.

OS services
OS services host the high-level application container and are responsible for communicating with the Azure Sphere Security Service. They manage network authentication and the network firewall for all outbound traffic.
During development, OS services also communicate with a connected PC and the application that is being debugged. Custom Linux kernel The custom Linux-based kernel runs in supervisor mode, along with a boot loader. The kernel is carefully tuned for the flash and RAM footprint of the Azure Sphere MCU. It provides a surface for preemptable execution of user-space processes in separate virtual address spaces. The driver model exposes MCU peripherals to OS services and applications. Azure Sphere drivers include Wi-Fi (which includes a TCP/IP networking stack), UART, SPI, I2C, and GPIO, among others.
Security Monitor
The Microsoft-supplied Security Monitor runs in SW. It is responsible for protecting security-sensitive hardware, such as memory, flash, and other shared MCU resources, and for safely exposing limited access to those resources. The Security Monitor brokers and gates access to the Pluton security subsystem and the hardware root of trust, and acts as a watchdog for the NW environment. It starts the boot loader, exposes runtime services to NW, and manages hardware firewalls and other silicon components that are not accessible to NW.

Azure Sphere Security Service
The Azure Sphere Security Service comprises three components: password-less authentication, update, and error reporting.
 Password-less authentication – The authentication component provides remote attestation and password-less authentication. The remote attestation service connects via a challenge-response protocol that uses the measured boot feature of the Pluton subsystem. It verifies not merely that the device booted with the correct software, but with the correct version of that software. After attestation succeeds, the authentication service takes over. The authentication service communicates over a secured TLS connection and issues a certificate that the device can present to a web service, such as Microsoft Azure or a company's private cloud. The web service validates the certificate chain, thus verifying that the device is genuine, that its software is up to date, and that Microsoft is its source. The device can then connect safely and securely to the online service.
 Update – The update service distributes automatic updates for the Azure Sphere OS and for applications. It ensures continued operation and enables the remote servicing and update of application software.
 Error reporting – The error reporting service provides simple crash reporting for deployed software.
To obtain richer data, use the reporting and analysis features that are included with a Microsoft Azure subscription.

Azure Cloud Shell and Mobile Apps
Azure Cloud Shell is another service under the Microsoft banner that gives you a Bash or PowerShell console without leaving your browser. Since the service is browser-based, there is no need to maintain a local setup for either platform. Cloud Shell is integrated with the cloud itself, so there is no underlying infrastructure to worry about when your only focus is the console.
With Azure Cloud Shell, the goal is to develop and manage Azure resources in a friendlier environment. The service offers a pre-configured, browser-accessible shell experience for managing Azure resources without the additional cost of machine maintenance, versioning, and installation. And since the whole idea is to provide interactive sessions through Cloud Shell, the machine, which works on a per-request basis, automatically terminates the session if left idle for 20 minutes. The latest upgrades enable Azure Cloud Shell to run on Ubuntu 16.04 LTS.

Getting Started with Azure Cloud Shell
The Cloud Shell service can be used within the Azure Container service, based on your subscription type. Not every subscriber has to pay for the storage account separately; if your subscription allows, it can be created and associated with the current package. The storage account is tied to Cloud Shell and can be used right away. The container is mounted under the PowerShell user profile. In short, Azure Cloud Shell is your Microsoft-managed admin machine in Azure, which enables you to:
 Get authenticated shell access to Azure from virtually anywhere
 Use common programming languages and tools in a shell that's maintained and updated by Microsoft
 Persist your data files across sessions in Azure Files
With Azure, you have the flexibility to choose the shell experience that best matches the way you work. Both PowerShell and Bash experiences are available.

Microsoft Azure Cloud Shell Important Features
Here are the most important features associated with Azure Cloud Shell:

Automatic Authentication for Improved Security
Cloud Shell automatically and securely authenticates account access for PowerShell and the Azure CLI. The interactive session is terminated if the shell is inactive for more than 20 minutes, which helps improve security.

Persistence Across Sessions
To persist your files across sessions, Cloud Shell walks you through attaching an Azure file share on first launch. After that, Cloud Shell attaches your storage automatically and persists it for all future sessions. Moreover, your home directory is saved as a .img file in your Azure file share. Files outside of the machine state or home directory are not persisted across sessions. It is best to follow the Cloud Shell best practices for storing secrets such as SSH keys.
Virtual Access from Anywhere The service lets you connect to the Azure platform through a browser-based, authenticated shell experience that is hosted in the cloud and can be accessed from anywhere. Cloud Shell machines are assigned on a per-user basis, and the user account is authenticated for each session for increased security. You can enjoy a modern CLI experience from multiple access points, including the Azure portal, shell.azure.com, the Azure mobile app, Azure docs (such as the Azure PowerShell and Azure CLI pages), and the VS Code Azure Account extension. Common Programming Languages and Tools Like any other component of the Microsoft platform, Cloud Shell is regularly updated and maintained. The browser-based service comes with common CLI tools, which include PowerShell modules, Linux shell interpreters, source control, text editors, Azure tools, container tools, build tools, database tools,
and many more. Cloud Shell also works with a number of supported programming languages; the most popular include Python, .NET, and Node.js. Azure Drive Cloud Shell in PowerShell starts in the Azure drive (Azure:). This lets you navigate through the entire range of Azure resources, including Storage, Network, and Compute, and discovery and navigation work just like filesystem navigation. You are not tied to the drive, however: you can still manage the same resources using Azure PowerShell cmdlets from anywhere. Whatever changes you make to your Azure resources are reflected in the drive right away. To refresh the resources, run dir -Force. Configured and Authenticated Azure Workstation Microsoft manages Cloud Shell and, as mentioned earlier, keeps its language support and command-line tools up to date. Cloud Shell is also responsible for securely and automatically authenticating your access to resources through the Azure CLI. Seamless Deployment One of the latest updates to Cloud Shell is a graphical text editor, based on the open-source Monaco Editor. The feature lets you create and edit files directly in the shell, which helps with seamless and smooth deployment through Azure PowerShell or Azure CLI 2.0. As far as pricing is concerned, the Cloud Shell machine hosting itself is free. The only prerequisite is a mounted Azure Files share, so regular storage costs may apply to that share. For a more detailed understanding of Azure Cloud Shell and to use it for maximum benefit, Azure training and Azure certification are good next steps.
Azure Mobile Apps Azure Mobile Apps (also known as the Microsoft Datasync Framework) gives enterprise developers and system integrators a mobile-application development platform that's highly scalable and globally available. The framework provides your mobile app with:  Authentication  Data query  Offline data synchronization
Azure Mobile Apps is designed to work with Azure App Service. Since it's based on ASP.NET 6, it can also run as a container in Azure Container Apps or Azure Kubernetes Service. Why Mobile Apps? With the Mobile Apps SDKs, you can:  Build native and cross-platform apps: Build cloud-enabled apps for Android™, iOS, or Windows using native SDKs.  Connect to your enterprise systems: Authenticate your users with Azure Active Directory, and connect to enterprise data stores.  Build offline-ready apps with data sync: Make your mobile workforce more productive by building apps that work offline. Use Azure Mobile Apps to sync data in the background. Azure Mobile Apps features The following features are important to cloud-enabled mobile development:  Authentication and authorization: Use Azure Mobile Apps to sign in users with social and enterprise providers. Azure App Service supports Azure Active Directory, Facebook™, Google®, Microsoft, Twitter®, and OpenID Connect®. Azure Mobile Apps supports any authentication scheme that is supported by ASP.NET Core.  Data access: Azure Mobile Apps provides a mobile-friendly OData v4 data source that's linked to a compatible database via Entity Framework Core. Any compatible database can be used, including Azure SQL, Azure Cosmos DB, or an on-premises Microsoft SQL Server.
 Offline sync: Build robust and responsive mobile applications that operate with an offline dataset. You can sync this dataset automatically with the service, and handle conflicts with ease.  Client SDKs: There's a complete set of client SDKs that cover cross-platform development (.NET and Apache Cordova™). Each client SDK is available with an MIT license and is open source. Azure App Service features The following platform features are useful for mobile production sites:  Autoscaling: With App Service, you can quickly scale up or scale out to handle any incoming customer load. Manually select the number and size of VMs, or set up autoscaling to scale your service based on load or schedule.  Staging environments: App Service can run multiple versions of your site, so you can perform A/B testing and do in-place staging of a new mobile service.  Continuous deployment: App Service integrates with common source control management (SCM) systems, allowing you to easily deploy a new version of your mobile service.  Virtual networking: App Service can connect to on-premises resources by using virtual networks, Azure ExpressRoute, or hybrid connections.  Isolated and dedicated environments: You can run App Service apps in a fully isolated and dedicated environment for secure operation. This environment is ideal for application workloads that require high scale, isolation, or secure network access.
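The OData v4 data source mentioned under "Data access" above can be made concrete by looking at the query URLs the client SDKs generate. A minimal sketch, assuming a hypothetical backend host and a hypothetical table named TodoItem:

```shell
# Hypothetical host and table name; an Azure Mobile Apps backend exposes each
# table at a /tables/{name} endpoint that accepts OData query options.
BASE="https://contoso.azurewebsites.net/tables/TodoItem"

# Standard OData v4 query options: filtering, projection, ordering, paging.
QUERY='$filter=complete eq false&$select=id,title&$orderby=title&$top=10'

URL="${BASE}?${QUERY}"
echo "$URL"
```

In practice the client SDKs build and URL-encode these options for you; the raw form is shown here only to make the OData query shape visible.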