Elena López
Microsoft MVP – Data Platform
elopez@solvex.com.do
www.solvex.com.do
Data Architecture in Azure
Beyond Data Factory, Power BI, and Azure SQL Database
Azure Data Services
Advanced Analytics
Data sources: Social, LOB, Graph, IoT, Image, CRM
• INGEST: Data orchestration and monitoring (Azure Data Factory, SSIS)
• STORE: Big data store (Azure Data Lake Storage Gen2, Blob Storage, Cosmos DB)
• PREP: Transform & Clean (Azure Databricks, Azure HDInsight, Power BI Dataflow, Azure Data Lake Analytics)
• MODEL & SERVE (& store): Data warehouse, AI, BI + Reporting (Azure SQL Data Warehouse, Azure Analysis Services, Cosmos DB, Power BI Aggregations)
Azure Data Lake Storage Gen2
A “no-compromises” Data Lake: secure, performant, massively scalable Data Lake storage that brings the cost and scale profile of object storage together with the performance and analytics feature set of data lake storage.
SCALABLE
• No limits on data store size
• Global footprint (50 regions)
FAST
• Atomic file operations mean jobs complete faster
SECURE
• Support for fine-grained ACLs, protecting data at the file and folder level
• Multi-layered protection via at-rest Storage Service encryption and Azure Active Directory integration
MANAGEABLE
• Automated Lifecycle Policy Management
• Object-level tiering
COST EFFECTIVE
• Object store pricing levels
• File system operations minimize transactions required for job completion
INTEGRATION READY
• Optimized for Spark and Hadoop analytic engines
• Tightly integrated with Azure end-to-end analytics solutions
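The lifecycle and tiering bullets can be pictured as a rule that maps an object's age to an access tier. The following is a minimal sketch, not the Azure policy engine: the function name `choose_tier` and the 30- and 180-day thresholds are assumptions made for illustration; only the Hot, Cool, and Archive tier names come from Azure Storage.

```python
from datetime import date

# Hypothetical thresholds (days since last modification); tune per workload.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180


def choose_tier(last_modified: date, today: date) -> str:
    """Return the access tier an object would be moved to under this toy rule."""
    age_days = (today - last_modified).days
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "Archive"
    if age_days >= COOL_AFTER_DAYS:
        return "Cool"
    return "Hot"
```

In a real deployment this kind of rule would be declared as a storage lifecycle policy rather than coded by hand; the sketch only shows the decision such a policy encodes.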
Organizing a Data Lake – Folder structure
Objectives
• Plan the structure based on optimal data retrieval
• Avoid a chaotic, unorganized data swamp
Common ways to organize the data:
• Subject Area
• Time Partitioning: Year/Month/Day/Hour/Minute
• Downstream App/Purpose
• Owner / Steward / SME
• Security Boundaries: Department, Business unit, etc.
• Data Retention Policy: Temporary data, Permanent data, Applicable period (e.g., project lifetime), etc.
• Business Impact / Criticality: High (HBI), Medium (MBI), Low (LBI), etc.
• Confidential Classification: Public information, Internal use only, Supplier/partner confidential, Personally identifiable information (PII), Sensitive – financial, Sensitive – intellectual property, etc.
• Probability of Data Access: Recent/current data, Historical data, etc.
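One way to combine several of these dimensions is to encode them in the folder path. The sketch below assumes a hypothetical layout of business unit / subject area / classification / year / month / day; the function name `lake_path` and the sample values are illustrative, not a prescribed standard.

```python
from datetime import datetime
from pathlib import PurePosixPath


def lake_path(business_unit: str, subject_area: str, classification: str,
              event_time: datetime, file_name: str) -> str:
    """Build a time-partitioned folder path (hypothetical layout)."""
    return str(PurePosixPath(
        business_unit,
        subject_area,
        classification,
        f"{event_time:%Y}",
        f"{event_time:%m}",
        f"{event_time:%d}",
        file_name,
    ))


print(lake_path("finance", "sales", "internal",
                datetime(2019, 3, 7, 14, 30), "orders.csv"))
# finance/sales/internal/2019/03/07/orders.csv
```

Putting the security boundary first and the time partition last keeps access rules coarse and date-range reads cheap; other orderings may suit other retrieval patterns better.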
Automated Machine Learning
Architecture (Automated ML)
External Data
• Energy Consumption Data & Weather Data (Public Source)
Azure Services
• Azure WebJob (Get Data): runs jobs to get data from the public source
• Azure Event Hub (Data for Real-time Processing): stores streaming data
• Azure Stream Analytics (Data Stream Job): processes events as they arrive in the Event Hub
• Azure SQL: contains historical energy consumption and weather data
• Azure Data Factory (Hourly Prediction Updates): pipeline invokes the AML Web Service
• Send to Azure SQL for predictions
• Power BI Dashboard: real-time data stats
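To make the streaming step concrete, the following sketch mimics in plain Python what a windowed aggregation over incoming events does: it groups readings into hourly buckets and averages them. It is not Stream Analytics code; the event shape `(timestamp, kWh)`, the function name `hourly_average`, and the sample readings are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def hourly_average(events):
    """Average the readings that fall in each clock hour (tumbling window)."""
    buckets = defaultdict(list)
    for timestamp, kwh in events:
        # Truncate the timestamp to the start of its hour to pick the window.
        window_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[window_start].append(kwh)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Made-up readings for illustration only.
sample = [
    (datetime(2019, 1, 1, 9, 5), 10.0),
    (datetime(2019, 1, 1, 9, 40), 14.0),
    (datetime(2019, 1, 1, 10, 15), 9.0),
]
print(hourly_average(sample))
```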
Machine Learning Process
Data sources: SQL DB, Cosmos DB, Data warehouse, Data lake, Blob storage, …
Steps: Prepare Data → Build & Train → Deploy
Machine Learning Problem Example
How much is this car worth?
Model Creation Is Typically Time-Consuming
Which features? Mileage, Condition, Car brand, Year of make, Regulations, …
Which algorithm? Gradient Boosted, Nearest Neighbors, SVM, Bayesian Regression, LGBM, …
Which parameters? Parameter 1, Parameter 2, Parameter 3, Parameter 4, …
Example iterations:
• Gradient Boosted on Mileage, Car brand, Year of make; parameters: Criterion, Loss, Min Samples Split, Min Samples Leaf, others → Model
• Nearest Neighbors on Car brand, Year of make, Condition; parameters: N Neighbors, Weights, Metric, P, others → Model
Iterate
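The three questions multiply: every feature subset can be paired with every algorithm and every parameter setting. The sketch below counts the candidates for a small, hypothetical search space; the parameter grids are invented for illustration, and only the feature and algorithm names come from the slides.

```python
from itertools import combinations, product

features = ["Mileage", "Condition", "Car brand", "Year of make", "Regulations"]

# Hypothetical parameter grids; real grids are usually much larger.
param_grids = {
    "Gradient Boosted": {"loss": ["squared", "absolute"],
                         "min_samples_split": [2, 5, 10]},
    "Nearest Neighbors": {"n_neighbors": [3, 5, 7],
                          "weights": ["uniform", "distance"]},
}


def feature_subsets(names):
    """Yield every non-empty subset of the feature names."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def candidates():
    """Yield (algorithm, parameters, features) for every combination."""
    for subset in feature_subsets(features):
        for algorithm, grid in param_grids.items():
            keys = list(grid)
            for values in product(*(grid[k] for k in keys)):
                yield algorithm, dict(zip(keys, values)), subset


total = sum(1 for _ in candidates())
print(total)  # 31 feature subsets x (6 + 6) parameter settings = 372
```

Even with five features and two algorithms the count reaches the hundreds, which is why trying combinations by hand is slow.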
Introducing Automated Machine Learning
Dataset + Optimization Metric + Constraints (Time/Cost) → Automated ML → ML Model
Accessible & Faster
Automated ML Accelerates Model Development
• Input: Enter data, Define goals, Apply constraints
• Intelligently test multiple models in parallel
• Output: Optimized model
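The input/output picture above can be sketched as a loop: take a dataset, a metric to optimize, and a budget, try candidate models, and return the best one. This toy version is an illustration under heavy assumptions, not how Azure Automated ML works internally: the candidate models, the synthetic price data, the use of `max_trials` as a stand-in for a time/cost constraint, and names such as `auto_select` are all made up for the example.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline: always predict the average price."""
    avg = mean(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Least-squares line through (mileage, price) points."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return lambda x: y_bar + slope * (x - x_bar)


def fit_nearest(xs, ys):
    """1-nearest-neighbor: copy the price of the closest training car."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mean_abs_error(model, xs, ys):
    """Optimization metric: mean absolute error on held-out data."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


def auto_select(train, valid, candidates, max_trials):
    """Try up to max_trials candidates; return (name, model, validation error)."""
    best = None
    for name, fit in list(candidates.items())[:max_trials]:
        model = fit(*train)
        error = mean_abs_error(model, *valid)
        if best is None or error < best[2]:
            best = (name, model, error)
    return best


# Synthetic data: price falls linearly with mileage (illustration only).
mileage = [10_000, 30_000, 50_000, 70_000, 90_000, 110_000]
price = [20_000 - m // 10 for m in mileage]
train = (mileage[::2], price[::2])
valid = (mileage[1::2], price[1::2])

candidates = {"mean": fit_mean, "line": fit_line, "nearest": fit_nearest}
name, model, error = auto_select(train, valid, candidates, max_trials=3)
print(name, round(error, 2))
```

Because the synthetic prices are exactly linear in mileage, the line model wins here; with a tighter budget (`max_trials=1`) only the baseline is tried, which shows how the constraint trades model quality for cost.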
Automated ML Capabilities
• Based on Microsoft Research
• Brain trained with several million experiments
• Collaborative filtering and Bayesian optimization
• Privacy preserving: No need to “see” the data
Automated ML Capabilities
• ML Scenarios: Classification & Regression
• Integration: Azure Machine Learning, Azure Notebooks, Jupyter Notebooks
• Data Type: Numeric, Text
• Languages: Python SDK for deployment and hosting for inference
• Training Compute: Local Machine, Remote Azure DSVM (Linux), Azure Batch AI
• Transparency: View run history, model metrics
• Scale: Faster model training using multiple cores and parallel experiments
Model Explainability
GA:
• Feature importance as part of training
• Simple UX for feature importance for a selected iteration
• Local feature importance for a given sample
Post GA:
• Importance of raw data columns
• Accuracy and performance improvements
File – New Project
Let’s do it!
1. Download Azure Storage Explorer
2. Save for later
Server: demoml.database.windows.net
User: elopez
3. Free temporary Azure account
User: lab@solvex.com.do
Pass: Dac93748

Arquitectura de Datos en Azure

  • 1. Elena López MicrosoftMVP – DataPlatform elopez@solvex.com.do www.solvex.com.do Arquitectura de Datos en Azure más allá de Data Factory, Power BI y Azure SQL Database
  • 2. Azure Data Services Advanced Analytics Social LOB Graph IoT Image CRM INGEST STORE PREP MODEL & SERVE (& store) Data orchestration and monitoring Big data store Transform & Clean Data warehouse AI BI + Reporting Azure Data Factory SSIS Azure Data Lake Storage Gen2 Blob Storage Cosmos DB Azure Databricks Azure HDInsight Power BI Dataflow Azure Data Lake Analytics Azure SQL Data Warehouse Azure Analysis Services Cosmos DB Power BI Aggregations
  • 3. A “no-compromises” Data Lake: secure, performant, massively-scalable Data Lake storage that brings the cost and scale profile of object storage together with the performance and analytics feature set of data lake storage Azure Data Lake Storage Gen2 M A N A G E A B L E S C A L A B L EF A S TS E C U R E  No limits on data store size  Global footprint (50 regions)  Optimized for Spark and Hadoop Analytic Engines  Tightly integrated with Azure end to end analytics solutions  Automated Lifecycle Policy Management  Object Level tiering  Support for fine- grained ACLs, protecting data at the file and folder level  Multi-layered protection via at-rest Storage Service encryption and Azure Active Directory integration C O S T E F F E C T I V E I N T E G R AT I O N R E A D Y  Atomic file operations means jobs complete faster  Object store pricing levels  File system operations minimize transactions required for job completion
  • 4. Organizing a Data Lake – Folder structure. Objectives: plan the structure based on optimal data retrieval; avoid a chaotic, unorganized data swamp. Common ways to organize the data: Data Retention Policy (temporary data, permanent data, applicable period such as project lifetime, etc.); Business Impact / Criticality (High (HBI), Medium (MBI), Low (LBI), etc.); Confidential Classification (public information, internal use only, supplier/partner confidential, personally identifiable information (PII), sensitive – financial, sensitive – intellectual property, etc.); Probability of Data Access (recent/current data, historical data, etc.); Owner / Steward / SME; Subject Area; Security Boundaries (department, business unit, etc.); Time Partitioning (Year/Month/Day/Hour/Minute); Downstream App/Purpose.
  • 6. Architecture. External Data: Energy Consumption Data & Weather Data (Public Source). Azure Services: Azure WebJob (Get Data) runs jobs to get data from the public source; Azure Event Hub stores streaming data; Azure Stream Analytics (Data Stream Job) processes events as they arrive in the Event Hub; Azure SQL contains historical energy consumption & weather data; Azure Data Factory pipeline invokes the AML Web Service (Automated ML); Send to Azure SQL for predictions (Hourly Prediction Updates). Power BI Dashboard: Data for Real-time Processing; Real time data stats.
  • 7. Machine Learning Process. Data sources (SQL DB, Cosmos DB, Data warehouse, Data lake, Blob storage, …) → Prepare Data → Build & Train → Deploy
  • 8. Machine Learning Problem Example: How much is this car worth?
  • 9. Model Creation Is Typically Time-Consuming. Which features? Mileage, Condition, Car brand, Year of make, Regulations, … Which algorithm? Gradient Boosted, Nearest Neighbors, SVM, Bayesian Regression, LGBM, … Which parameters? Parameter 1, Parameter 2, Parameter 3, Parameter 4, … Example: features Mileage, Car brand, Year of make; algorithm Gradient Boosted; parameters Criterion, Loss, Min Samples Split, Min Samples Leaf, Others → Model
  • 10. Model Creation Is Typically Time-Consuming. Which features? Mileage, Condition, Car brand, Year of make, Regulations, … Which algorithm? Gradient Boosted, Nearest Neighbors, SVM, Bayesian Regression, LGBM, … Which parameters? For Gradient Boosted (features Mileage, Car brand, Year of make): Criterion, Loss, Min Samples Split, Min Samples Leaf, Others. For Nearest Neighbors (features Car brand, Year of make, Condition): N Neighbors, Weights, Metric, P, Others → Model. Iterate.
  • 11. Model Creation Is Typically Time-Consuming. Which features? Which algorithm? Which parameters? Iterate.
  • 12. Introducing Automated Machine Learning. Dataset, Optimization Metric, Constraints (Time/Cost) → Automated ML → ML Model. Accessible & Faster.
  • 13. Automated ML Accelerates Model Development. Input: enter data, define goals, apply constraints. Intelligently test multiple models in parallel. Output: optimized model.
  • 14. Automated ML Capabilities • Based on Microsoft Research • Brain trained with several million experiments • Collaborative filtering and Bayesian optimization • Privacy preserving: No need to “see” the data
  • 15. Automated ML Capabilities • ML Scenarios: Classification & Regression • Integration: Azure Machine Learning, Azure Notebooks, Jupyter Notebooks • Data Type: Numeric, Text • Languages: Python SDK for deployment and hosting for inference • Training Compute: Local Machine, Remote Azure DSVM (Linux), Azure Batch AI • Transparency: View run history, model metrics • Scale: Faster model training using multiple cores and parallel experiments
  • 16. Model Explainability. GA: • Feature importance as part of training • Simple UX for feature importance for a selected iteration • Local feature importance for a given sample. Post GA: • Importance of raw data columns • Accuracy and performance improvements
  • 17. File – New Project Let’s do it!
  • 18. 1. Download Azure Storage Explorer 2. Save for later Server: demoml.database.windows.net User: elopez 3. Free temporary azure account User: lab@solvex.com.do Pass: Dac93748
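
The folder-structure slide lists time partitioning (Year/Month/Day/Hour/Minute) and subject area among the common ways to organize a data lake. A minimal Python sketch of how such a path could be built is shown here. The zone, subject area and dataset names (`raw`, `sales`, `orders`, `iot`, `telemetry`) and the function `build_lake_path` are hypothetical and do not come from the talk; the ordering of the path segments is one possible convention, not a recommendation from the slides.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Partition levels, coarsest first, as listed on the slide.
LEVELS = ["year", "month", "day", "hour", "minute"]


def build_lake_path(zone, subject_area, dataset, event_time, granularity="day"):
    """Build a time-partitioned folder path down to the requested granularity.

    zone, subject_area and dataset are hypothetical naming levels chosen
    for illustration only.
    """
    if granularity not in LEVELS:
        raise ValueError(f"granularity must be one of {LEVELS}")
    parts = [
        f"{event_time.year:04d}",
        f"{event_time.month:02d}",
        f"{event_time.day:02d}",
        f"{event_time.hour:02d}",
        f"{event_time.minute:02d}",
    ]
    depth = LEVELS.index(granularity) + 1
    return str(PurePosixPath(zone, subject_area, dataset, *parts[:depth]))


ts = datetime(2019, 3, 7, 14, 30, tzinfo=timezone.utc)
print(build_lake_path("raw", "sales", "orders", ts))
print(build_lake_path("raw", "iot", "telemetry", ts, granularity="hour"))
```

Zero-padded segments keep folders in chronological order when listed alphabetically, and the granularity argument lets high-volume sources (for example IoT) partition more finely than low-volume ones.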

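The "Model Creation Is Typically Time-Consuming" slides describe a loop: choose features, choose an algorithm, choose parameters, evaluate, iterate. The sketch below illustrates that loop in plain Python for the car-price example. It is a brute-force search over a single algorithm (nearest neighbors), not the approach used by Azure Automated ML, which the slides say is based on collaborative filtering and Bayesian optimization. The data in `CARS` is invented for illustration, and all function names are hypothetical.

```python
from itertools import combinations, product

# Hypothetical toy data: (mileage in thousands of km, age in years, condition 1-5) -> price
CARS = [
    ((20, 1, 5), 18000),
    ((40, 2, 4), 15500),
    ((60, 4, 4), 12000),
    ((90, 6, 3), 9000),
    ((120, 8, 3), 6500),
    ((150, 10, 2), 4000),
    ((30, 2, 5), 16500),
    ((110, 7, 2), 6000),
]
FEATURES = ("mileage", "age", "condition")


def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(train, query, k, weighted):
    """Predict a price as the (optionally distance-weighted) mean of the k nearest cars."""
    nearest = sorted(train, key=lambda row: distance(row[0], query))[:k]
    if not weighted:
        return sum(price for _, price in nearest) / len(nearest)
    weights = [1.0 / (distance(x, query) + 1e-9) for x, _ in nearest]
    return sum(w * p for w, (_, p) in zip(weights, nearest)) / sum(weights)


def leave_one_out_mae(columns, k, weighted):
    """Mean absolute error when each car is predicted from all the others."""
    rows = [(tuple(x[c] for c in columns), y) for x, y in CARS]
    errors = []
    for i, (query, actual) in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        errors.append(abs(knn_predict(train, query, k, weighted) - actual))
    return sum(errors) / len(errors)


def search(max_trials=50):
    """Try feature subsets and parameters; keep the configuration with the lowest error."""
    best = None
    subsets = [c for r in (1, 2, 3) for c in combinations(range(len(FEATURES)), r)]
    for trial, (columns, k, weighted) in enumerate(product(subsets, (1, 2, 3), (False, True))):
        if trial >= max_trials:  # the time/cost constraint, expressed here as a trial budget
            break
        score = leave_one_out_mae(columns, k, weighted)
        if best is None or score < best[0]:
            best = (score, [FEATURES[c] for c in columns], k, weighted)
    return best


score, features, k, weighted = search()
print(f"best MAE={score:.0f} features={features} k={k} weighted={weighted}")
```

Even with three features, one algorithm and two parameters, the search covers 42 configurations, which gives a sense of why the slides call manual model creation time-consuming.
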
Editor's Notes

  1. https://www.jamesserra.com/archive/2019/01/what-product-to-use-to-transform-my-data/