LEVERAGING DATA DRIVEN RESEARCH
THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
miguel.gonzalezfierro@microsoft.com
https://miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK
AZURE FOR RESEARCH AWARD
Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro
azurerfp@microsoft.com
Free Azure resources if awarded
Areas: data science, climate, health…
Ex: Alan Turing Institute got $5M
OUTLINE
- Spark and Hadoop with Azure
- Azure ML Studio
- Data Science Virtual Machine
SPARK & HADOOP WITH AZURE
WHAT IS HDINSIGHT
HDInsight
Managed Service
MANAGER GUI: AMBARI
APACHE HADOOP
Software for storing and analysing
massive amounts (~TB) of
structured and unstructured data
APACHE SPARK
Framework that runs large-scale data analytics applications
APIs: PySpark (Python), Spark (Scala), SparkR (R)
Up to 100x faster than Hadoop MapReduce for in-memory processing
APACHE KAFKA
Stream processing for real-time apps
Publish-subscribe messaging system
Millions of messages per second
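As a rough illustration of the publish-subscribe pattern named above, the following pure-Python sketch keeps an append-only log per topic and lets each subscriber read from its own offset. It is an in-memory toy, not the Kafka client API; the class `TinyBroker`, its methods and the sample messages are hypothetical names chosen for this example.

```python
from collections import defaultdict


class TinyBroker:
    """In-memory toy broker (hypothetical; not the Kafka API)."""

    def __init__(self):
        # topic name -> append-only list of messages
        self._logs = defaultdict(list)
        # (subscriber, topic) -> index of the next message to read
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to the topic log and return its offset."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def poll(self, subscriber, topic):
        """Return the messages this subscriber has not yet seen on the topic."""
        start = self._offsets[(subscriber, topic)]
        messages = self._logs[topic][start:]
        self._offsets[(subscriber, topic)] = len(self._logs[topic])
        return messages


broker = TinyBroker()
broker.publish("sensor", "t=20.5")
broker.publish("sensor", "t=20.7")
first_read = broker.poll("dashboard", "sensor")   # both messages so far
broker.publish("sensor", "t=21.0")
second_read = broker.poll("dashboard", "sensor")  # only the new message
late_read = broker.poll("archiver", "sensor")     # a new subscriber reads from the start
```

Because publishers only append and each subscriber tracks its own position, slow and fast consumers can share a topic without interfering with each other.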
APACHE STORM
Distributed framework for real-time applications
ETL, continuous computation, online machine learning
Millions of operations per second on each node
APACHE HBASE
Non-relational database (NoSQL) for Big Data applications
Distributed, fault-tolerant and scalable
Built on top of HDFS (Hadoop Distributed File System)
APACHE HIVE
SQL-like language to query data in Hadoop systems
Word count program
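Word count is the customary first example for Hadoop-style processing. The slide's own code is not preserved in this transcript, so the following is only a pure-Python sketch of the map, shuffle and reduce steps such a program performs. The function names and sample lines are assumptions made for illustration, and this is not HiveQL.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get one count per word."""
    return {word: sum(counts) for word, counts in groups.items()}


# Hypothetical sample input
lines = ["big data on Azure", "big clusters on demand"]
word_counts = reduce_phase(shuffle_phase(map_phase(lines)))
```

In a SQL-like language the same result is usually expressed as a split of each line into words followed by a group-by and count, with the engine handling the distribution.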
EXAMPLE OF ARCHITECTURE
DEMO: PYSPARK APPLICATION
- Log analysis with PySpark
  source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-custom-library-website-log-analysis
- Predictive analysis on food inspection with PySpark
  source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-machine-learning-mllib-ipython
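To give a flavour of what a log analysis job involves, here is a small sketch in plain Python rather than PySpark: it parses a few web server log lines, skips malformed ones, and counts responses per status code. The log layout, the regular expression and the sample lines are assumptions for illustration and are not taken from the linked tutorial.

```python
import re
from collections import Counter

# Assumed layout: date time method uri status time_taken_ms (hypothetical sample data)
LOG_LINES = [
    "2014-01-01 02:01:09 GET /blog/post1 200 2011",
    "2014-01-01 02:01:12 GET /blog/post2 200 53175",
    "2014-01-01 02:01:14 GET /missing 404 15",
    "malformed line",
]

LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<method>[A-Z]+) (?P<uri>\S+) "
    r"(?P<status>\d{3}) (?P<time_taken>\d+)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["time_taken"] = int(record["time_taken"])
    return record


# Keep only the lines that parsed successfully
records = [r for r in map(parse_line, LOG_LINES) if r is not None]
status_counts = Counter(r["status"] for r in records)
slowest = max(records, key=lambda r: r["time_taken"])
```

In PySpark, a function like `parse_line` would typically be applied across the cluster with a `map` over an RDD followed by a `filter`, so the parsing logic itself stays the same.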
AZURE ML STUDIO
WHAT IS AZURE ML STUDIO
GUI for Machine Learning
DATA INPUT/OUTPUT
DATA TRANSFORMATION
DATA MANIPULATION
FEATURE SELECTION
CLASSIFICATION & REGRESSION
TRAINING & SCORING
PYTHON & R SCRIPTS
AUTOMATIC API
DEMO: CREDIT RISK ANOMALY DETECTION
source: https://gallery.cortanaintelligence.com/Experiment/1219e87f8fb84e88a2e1b54256808bb3
DATA SCIENCE VIRTUAL MACHINE
WHAT IS THE DSVM
Windows:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- Visual Studio
- SQL Server
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost
Linux:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- PyCharm
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost, Weka
DEEP LEARNING DSVM
Libs:
- CNTK
- MXNet
- TensorFlow
- Keras
Examples:
- Digit recognition
- Image recognition
NVIDIA TESLA K80
AI LANDSCAPE: IMAGES
ImageNet (image recognition competition) top-5 error (%):
- AlexNet (2012): 15.4%
- VGG (2014): 7.3%
- Inception (2015): 6.7%
- ResNet (2015): 3.6%
- Inception-ResNet (2016): 3.1%
- Human: 5.1%
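Top-5 error, the metric used in the ImageNet results above, counts a prediction as wrong only if the true label is absent from the model's five highest-scoring classes. A minimal sketch of the calculation follows; the function name, scores and labels are hypothetical values chosen for illustration.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores.

    scores: list of per-class score lists, one per sample
    labels: list of true class indices, one per sample
    """
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted by descending score
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 2 samples over 6 classes
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # true class 3 is ranked second
    [0.30, 0.25, 0.20, 0.15, 0.09, 0.01],  # true class 5 is ranked last
]
labels = [3, 5]
top5 = top_k_error(scores, labels, k=5)
top1 = top_k_error(scores, labels, k=1)
```

The first sample is a top-1 miss but a top-5 hit, which is why top-5 error is always less than or equal to top-1 error on the same predictions.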
AI LANDSCAPE: SPEECH
Microsoft Research achieves human parity
in conversational speech recognition
source: http://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition
- CNN (VGG, ResNet, LACE)
- RNN (Bi-LSTM)
- Multi-GPU and multi-server training (1-bit Stochastic Gradient Descent)
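1-bit SGD reduces the communication between GPUs and servers by sending one bit per gradient value and carrying the quantization error forward to the next step. The sketch below illustrates that idea on a plain Python list; the function names and the choice of group means as reconstruction values are assumptions for illustration, not the CNTK implementation.

```python
def one_bit_decode(bits, scale_pos, scale_neg):
    """Rebuild an approximate gradient from the bits and two scale values."""
    return [scale_pos if b else scale_neg for b in bits]


def one_bit_quantize(gradient, residual):
    """Quantize a gradient to one bit per value with error feedback.

    Returns (bits, scale_pos, scale_neg, new_residual).
    """
    # Add the quantization error carried over from the previous step
    corrected = [g + r for g, r in zip(gradient, residual)]
    bits = [1 if v >= 0 else 0 for v in corrected]
    positives = [v for v in corrected if v >= 0]
    negatives = [v for v in corrected if v < 0]
    # Reconstruction values: mean of each group (0.0 if the group is empty)
    scale_pos = sum(positives) / len(positives) if positives else 0.0
    scale_neg = sum(negatives) / len(negatives) if negatives else 0.0
    decoded = one_bit_decode(bits, scale_pos, scale_neg)
    # Whatever the bits could not express is kept for the next step
    new_residual = [v - d for v, d in zip(corrected, decoded)]
    return bits, scale_pos, scale_neg, new_residual


# Hypothetical gradient; the residual starts at zero
gradient = [0.5, -0.25, 1.5, -0.75]
residual = [0.0, 0.0, 0.0, 0.0]
bits, scale_pos, scale_neg, residual = one_bit_quantize(gradient, residual)
decoded = one_bit_decode(bits, scale_pos, scale_neg)
```

Only `bits` and the two scale values need to be exchanged between workers; the residual stays local and is added to the next gradient, so the quantization error is deferred rather than discarded.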
IMAGE CLASSIFICATION
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
IMAGE CLASSIFICATION IMAGENET
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
Figure labels: real class, predicted class
TEXT CLASSIFICATION
Diagram labels:
- Train: Dataset, Azure NC24 VM with 4 K80 GPUs, .R, model.params
- Score: Azure Cloud Services, Backend (.py, API, DNN), Web app (.js, .html, input text)
DEMO: TEXT CLASSIFICATION WEB APP
