LEVERAGING DATA DRIVEN RESEARCH
THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
miguel.gonzalezfierro@microsoft.com
https://miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK
AZURE FOR RESEARCH AWARD
Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro
azurerfp@microsoft.com
Free Azure resources if awarded
Areas: data science, climate, health…
Ex: Alan Turing Institute got $5M
OUTLINE
- Spark and Hadoop with Azure
- Azure ML Studio
- Data Science Virtual Machine
SPARK & HADOOP WITH AZURE
WHAT IS HDINSIGHT
HDInsight
Managed Service
MANAGER GUI: AMBARI
APACHE HADOOP
Software for storing and analysing
massive amounts (~TB) of
structured and unstructured data
APACHE SPARK
Framework that runs large-scale data analytics applications
PySpark, Spark (Scala), SparkR
Up to 100x faster than Hadoop MapReduce when processing in memory
APACHE KAFKA
Stream processing for real-time apps
Publisher & subscriber messaging system
Millions of messages per second
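The publisher and subscriber model can be sketched in plain Python. This is a toy in-memory illustration, not the Kafka client API: `TopicLog`, `publish`, `poll`, and the topic name are hypothetical, and the append-only log with one read offset per subscriber is a simplified assumption about how several consumers can read the same messages independently.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory message log (hypothetical; not the Kafka API)."""

    def __init__(self):
        # topic name -> append-only list of messages
        self._messages = defaultdict(list)
        # (subscriber, topic) -> index of the next message to read
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to a topic and return its position in the log."""
        self._messages[topic].append(message)
        return len(self._messages[topic]) - 1

    def poll(self, subscriber, topic):
        """Return the messages this subscriber has not read yet."""
        start = self._offsets[(subscriber, topic)]
        unread = self._messages[topic][start:]
        self._offsets[(subscriber, topic)] = start + len(unread)
        return unread


log = TopicLog()
log.publish("sensor-readings", {"device": "a", "temp": 21.5})
log.publish("sensor-readings", {"device": "b", "temp": 19.0})

# Each subscriber keeps its own offset, so both see every message once.
first_batch = log.poll("dashboard", "sensor-readings")
second_batch = log.poll("dashboard", "sensor-readings")
archiver_batch = log.poll("archiver", "sensor-readings")
```

Publishers never address subscribers directly; they only append to a topic, which is what lets the two sides scale separately.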
APACHE STORM
Distributed framework for real-time applications
ETL, continuous computation, online machine learning
Over a million operations per second per node
APACHE HBASE
Non-relational database (NoSQL) for Big Data applications
Distributed, fault-tolerant and scalable
Built on top of HDFS (Hadoop Distributed File System)
APACHE HIVE
SQL-like language to query data in Hadoop systems
Word count program
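The slide's word count program is not reproduced in this transcript. As a rough stand-in, the sketch below runs a SQL word count from Python using the standard-library `sqlite3` module. SQLite is used here only so the example runs without a cluster; it is not Hive, and the table name `words` and the sample lines are assumptions.

```python
import sqlite3

# Hypothetical sample documents; the slide's input data is not shown.
lines = [
    "spark and hadoop",
    "hadoop with azure",
    "spark with azure",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
# Split each line into words in Python; Hive would typically do this
# inside the query itself.
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [(word,) for line in lines for word in line.split()],
)

# The GROUP BY / COUNT pattern is the core of a SQL word count.
rows = conn.execute(
    "SELECT word, COUNT(*) AS n FROM words GROUP BY word ORDER BY n DESC, word"
).fetchall()
word_counts = dict(rows)
conn.close()
```

The point of a SQL-like layer is that the aggregation is declared in one query instead of being hand-written as map and reduce steps.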
EXAMPLE OF ARCHITECTURE
DEMO: PYSPARK APPLICATION
- Log analysis with PySpark
  source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-custom-library-website-log-analysis
- Predictive analysis on food inspection with PySpark
  source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-machine-learning-mllib-ipython
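A log analysis job usually follows a parse, filter, aggregate pattern. The sketch below shows that pattern in plain Python rather than PySpark so it runs without a cluster. The log lines, their three-field layout, and the `parse_status` helper are hypothetical and are not taken from the linked tutorial.

```python
from collections import Counter

# Hypothetical access-log lines in a simplified "method path status" layout.
log_lines = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /api/score 200",
    "GET /index.html 500",
    "malformed line",
]


def parse_status(line):
    """Return the HTTP status code as an int, or None if the line is malformed."""
    parts = line.split()
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    return int(parts[2])


# Parse every line, drop malformed ones, then aggregate by status code.
statuses = [s for s in map(parse_status, log_lines) if s is not None]
status_counts = Counter(statuses)
error_rate = sum(1 for s in statuses if s >= 400) / len(statuses)
```

In PySpark the same three steps would typically be expressed with operations such as `map`, `filter`, and `countByValue`, so the work is spread across the cluster.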
AZURE ML STUDIO
WHAT IS AZURE ML STUDIO
GUI for Machine Learning
DATA INPUT/OUTPUT
DATA TRANSFORMATION
DATA MANIPULATION
FEATURE SELECTION
CLASSIFICATION & REGRESSION
TRAINING & SCORING
PYTHON & R SCRIPTS
AUTOMATIC API
DEMO: CREDIT RISK ANOMALY DETECTION
source: https://gallery.cortanaintelligence.com/Experiment/1219e87f8fb84e88a2e1b54256808bb3
DATA SCIENCE VIRTUAL MACHINE
WHAT IS THE DSVM
Windows:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- Visual Studio
- SQL Server
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost
Linux:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- PyCharm
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost, Weka
DEEP LEARNING DSVM
Libs:
- CNTK
- MXNet
- TensorFlow
- Keras
Examples: digit recognition, image recognition
NVIDIA TESLA K80
AI LANDSCAPE: IMAGES
ImageNet (image recognition competition) top-5 error (%):
- AlexNet (2012): 15.4%
- VGG (2014): 7.3%
- Inception (2015): 6.7%
- ResNet (2015): 3.6%
- Inception-ResNet (2016): 3.1%
- Human: 5.1%
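Top-5 error counts a prediction as correct when the true label is among the model's five highest-scoring classes. A minimal sketch of the metric, assuming per-class scores are available as plain lists; the function name `top5_error` and the sample scores are hypothetical.

```python
def top5_error(scores_per_image, true_labels):
    """Fraction of images whose true label is not among the 5 highest-scoring classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Indices of the five classes with the highest scores.
        top5 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:5]
        if label not in top5:
            misses += 1
    return misses / len(true_labels)


# Hypothetical scores for 2 images over 8 classes.
scores = [
    [0.05, 0.30, 0.10, 0.20, 0.15, 0.08, 0.07, 0.05],
    [0.40, 0.02, 0.03, 0.25, 0.10, 0.05, 0.05, 0.10],
]
labels = [4, 1]  # true class index for each image

# Image 1 is a hit (class 4 is in its top 5); image 2 is a miss.
error = top5_error(scores, labels)
```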
AI LANDSCAPE: SPEECH
Microsoft Research achieves human parity in conversational speech recognition
source: http://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition
- CNN (VGG, ResNet, LACE)
- RNN (Bi-LSTM)
- Multi-GPU and multi-server training (1-bit Stochastic Gradient Descent)
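The idea behind 1-bit SGD is to cut communication between GPUs or servers by sending one bit per gradient value and keeping the quantization error locally for the next step. The sketch below is an assumption-laden illustration of that idea, not CNTK's implementation: the function names are hypothetical, and using the mean of each sign group as the reconstruction value is one simple choice among several.

```python
def quantize_1bit(gradient, residual):
    """Quantize a gradient to one bit per value, carrying the error forward.

    Returns (bits, (neg_value, pos_value), new_residual).
    """
    # Add the quantization error left over from the previous step.
    corrected = [g + r for g, r in zip(gradient, residual)]
    positives = [v for v in corrected if v >= 0]
    negatives = [v for v in corrected if v < 0]
    # One reconstruction value per sign: the mean of that group (an assumption).
    pos_value = sum(positives) / len(positives) if positives else 0.0
    neg_value = sum(negatives) / len(negatives) if negatives else 0.0
    bits = [1 if v >= 0 else 0 for v in corrected]
    reconstructed = [pos_value if b else neg_value for b in bits]
    # Whatever was lost in quantization is kept locally for the next step.
    new_residual = [v - q for v, q in zip(corrected, reconstructed)]
    return bits, (neg_value, pos_value), new_residual


def dequantize_1bit(bits, values):
    """Rebuild an approximate gradient from the bits and the two values."""
    neg_value, pos_value = values
    return [pos_value if b else neg_value for b in bits]


gradient = [0.5, -0.25, 1.5, -0.75]
residual = [0.0] * len(gradient)
bits, values, residual = quantize_1bit(gradient, residual)
approx = dequantize_1bit(bits, values)
```

Only `bits` and the two reconstruction values would need to travel between workers; the residual stays on the sender and is added to the next gradient.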
IMAGE CLASSIFICATION
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
IMAGE CLASSIFICATION IMAGENET
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
Figure labels: real class, predicted class
TEXT CLASSIFICATION
Train:
- Dataset
- Azure NC24 VM with 4 K80 GPUs (.R)
- model.params
Score:
- Azure Cloud Services
- Backend (.py): API, DNN
- Web app (.js, .html): input text
DEMO: TEXT CLASSIFICATION WEB APP

Leveraging Data Driven Research Through Microsoft Azure

  • 1. LEVERAGING DATA DRIVEN RESEARCH THROUGH MICROSOFT AZURE Dr. Miguel Fierro Data Scientist at Microsoft @miguelgfierro miguel.gonzalezfierro@microsoft.com https://miguelgfierro.com Plymouth University | Jan 27, 2017 | Plymouth, UK
  • 2. AZURE FOR RESEARCH AWARD Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro azurerfp@microsoft.com Free Azure resources if awarded Areas: data science, climate, health… Ex: Alan Turing Institute got $5M
  • 3. D a t a S c i e n ce V i r t ua l M a chi ne A z u re M L S t u d io S p a r k a n d H a d o o p w i t h A z u re OUTLINE
  • 4. SPARK & HADOOP WITH AZURE
  • 5. WHAT IS HDINSIGHT Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro HDInsight Managed Service
  • 6. MANAGER GUI: AMBARI Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro
  • 7. APACHE HADOOP. Software for storing and analysing massive amounts (~TB) of structured and unstructured data
  • 8. APACHE SPARK. Framework that runs large-scale data analytics applications. APIs: PySpark, Spark (Scala), SparkR. Up to 100x faster than Hadoop MapReduce when processing in memory
  • 9. APACHE KAFKA. Stream processing for real-time apps. Publisher & subscriber messaging system. Millions of messages per second
  • 10. APACHE STORM. Distributed framework for real-time applications. ETL, continuous computation, online machine learning. Millions of operations per second on each node
  • 11. APACHE HBASE. Non-relational database (NoSQL) for Big Data applications. Distributed, fault-tolerant and scalable. Built on top of HDFS (Hadoop Distributed File System)
  • 12. APACHE HIVE. SQL-like language to query data in Hadoop systems. Example: word count program
  • 13. EXAMPLE OF ARCHITECTURE
  • 14. DEMO: PYSPARK APPLICATION. Log analysis with PySpark, source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-custom-library-website-log-analysis | Predictive analysis on food inspection with PySpark, source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-machine-learning-mllib-ipython
  • 16. WHAT IS AZURE ML STUDIO. GUI for Machine Learning
  • 17. DATA INPUT/OUTPUT
  • 18. DATA TRANSFORMATION
  • 19. DATA MANIPULATION
  • 20. FEATURE SELECTION
  • 21. CLASSIFICATION & REGRESSION
  • 22. TRAINING & SCORING
  • 23. PYTHON & R SCRIPTS
  • 24. AUTOMATIC API
  • 25. DEMO: CREDIT RISK ANOMALY DETECTION. source: https://gallery.cortanaintelligence.com/Experiment/1219e87f8fb84e88a2e1b54256808bb3
  • 27. WHAT IS THE DSVM. Windows: Anaconda with Python, Jupyter notebooks; Microsoft R Server; Visual Studio; SQL Server; Azure SDK; deep learning (CNTK & MXNet); machine learning (XGBoost). Linux: Anaconda with Python, Jupyter notebooks; Microsoft R Server; PyCharm; Azure SDK; deep learning (CNTK & MXNet); machine learning (XGBoost, Weka)
  • 28. DEEP LEARNING DSVM. Libs: CNTK, MXNet, TensorFlow, Keras. Examples: digit recognition, image recognition
  • 29. NVIDIA TESLA K80
  • 30. AI LANDSCAPE: IMAGES. ImageNet (image recognition competition) top-5 error (%): AlexNet (2012) 15.4%; VGG (2014) 7.3%; Inception (2015) 6.7%; ResNet (2015) 3.6%; Inception-ResNet (2016) 3.1%; human 5.1%
  • 31. AI LANDSCAPE: SPEECH. Microsoft Research reaches human parity in conversational speech recognition. Models: CNN (VGG, ResNet, LACE) and RNN (Bi-LSTM). Training: multi-GPU and multi-server (1-bit stochastic gradient descent). source: http://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition
  • 32. IMAGE CLASSIFICATION. source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
  • 33. IMAGE CLASSIFICATION IMAGENET. Figure labels: real class, predicted class. source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
  • 34. TEXT CLASSIFICATION. Architecture diagram labels: Train; Score; Backend; Web app; API; Dataset; Azure NC24 VM with 4 K80 GPUs; .R; model.params; Azure Cloud Services; .py, .js, .html; DNN; input text
  • 35. DEMO: TEXT CLASSIFICATION WEB APP
  • 36. LEVERAGING DATA DRIVEN RESEARCH THROUGH MICROSOFT AZURE Dr. Miguel Fierro Data Scientist at Microsoft @miguelgfierro miguel.gonzalezfierro@microsoft.com https://miguelgfierro.com Plymouth University | Jan 27, 2017 | Plymouth, UK
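
Slide 12 names a word count program as the example for querying data in Hadoop systems. As a rough illustration of what such a job computes, the following single-machine sketch in plain Python mimics the map, shuffle and reduce stages. It is not the Hive query or the PySpark code shown in the talk; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) and the sample lines are assumptions made for illustration.

```python
from collections import defaultdict
import re


def map_words(line):
    """Map stage: emit a (word, 1) pair for every word in one line."""
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", line.lower())]


def shuffle(pairs):
    """Shuffle stage: group the emitted counts by word."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_counts(groups):
    """Reduce stage: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Run the three stages over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(shuffle(pairs))


# Hypothetical sample input, not data from the talk.
sample_lines = [
    "Spark and Hadoop with Azure",
    "Hadoop stores data, Spark analyses data",
]
counts = word_count(sample_lines)
```

On a cluster the map and reduce stages would run in parallel across nodes and the shuffle would move data between them; here everything runs in one process so the logic is easy to follow.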
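
Slide 30 compares models by their ImageNet top-5 error. For readers unfamiliar with the metric, here is a minimal sketch of how a top-k error could be computed from per-class scores: an example counts as an error when its true label is not among the k highest-scoring classes. The function name `top_k_error` and the toy scores are assumptions for illustration and are not results from the talk.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 2 examples over 6 classes.
example_scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],
]
example_labels = [2, 5]
error = top_k_error(example_scores, example_labels, k=5)
```

In this toy input the second example's true class has the lowest score of the six, so it falls outside the top five and half of the examples count as errors.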
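
Slide 31 mentions 1-bit stochastic gradient descent for multi-GPU and multi-server training. The sketch below shows the general idea in simplified form, assuming a flat list of gradient values: each value is sent as a single bit (its sign), decoded as the mean of the positive or negative values, and the quantization error is carried over to the next step. It is not the CNTK implementation; the function name `one_bit_quantize` and the sample gradient are hypothetical.

```python
def one_bit_quantize(gradient, residual):
    """Quantize a gradient to one bit per value, with error feedback.

    Returns the bits, the two decoding values, the decoded gradient and the
    new residual to add to the next step's gradient.
    """
    # Add the quantization error left over from the previous step.
    corrected = [g + r for g, r in zip(gradient, residual)]
    positives = [v for v in corrected if v >= 0]
    negatives = [v for v in corrected if v < 0]
    pos_value = sum(positives) / len(positives) if positives else 0.0
    neg_value = sum(negatives) / len(negatives) if negatives else 0.0
    # One bit per value: True for non-negative, False for negative.
    bits = [v >= 0 for v in corrected]
    decoded = [pos_value if bit else neg_value for bit in bits]
    # Whatever the one-bit encoding lost is remembered for the next step.
    new_residual = [c - d for c, d in zip(corrected, decoded)]
    return bits, (pos_value, neg_value), decoded, new_residual


# Hypothetical gradient values for one step, starting with no residual.
sample_gradient = [0.5, -0.2, 0.1, -0.4]
bits, values, decoded, residual = one_bit_quantize(sample_gradient, [0.0] * 4)
```

Only the bits and the two decoding values would need to be exchanged between workers, which is what reduces communication; the residual stays local and is added to the next gradient before quantizing again.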