In this talk I gave an overview of some of the tools that Microsoft Azure offers to researchers. I covered HDInsight, Microsoft's Big Data platform for building Spark and Hadoop applications; Azure ML Studio, a GUI for developing machine learning models quickly; and the Data Science Virtual Machine (DSVM), a VM aimed at data scientists and machine learning professionals that comes preinstalled with the software needed to build machine learning systems.
Leveraging Data Driven Research Through Microsoft Azure
1. LEVERAGING DATA DRIVEN RESEARCH
THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
miguel.gonzalezfierro@microsoft.com
https://miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK
2. AZURE FOR RESEARCH AWARD
Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro
azurerfp@microsoft.com
Free Azure resources if awarded
Areas: data science, climate, health…
Ex: Alan Turing Institute got $5M
3. OUTLINE
Data Science Virtual Machine
Azure ML Studio
Spark and Hadoop with Azure
7. APACHE HADOOP
Software for storing and analysing massive amounts (~TB) of structured and unstructured data
8. APACHE SPARK
Framework that runs large-scale data analytics applications
pySpark, Spark (Scala), SparkR
Up to 100x faster than Hadoop MapReduce (processing in memory)
9. APACHE KAFKA
Stream processing for real time apps
Publisher & subscriber messaging system
Millions of messages per second
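As a rough illustration of the publisher and subscriber pattern named on this slide, here is a toy in-memory sketch in Python. It is not the Kafka API: `InMemoryBroker` and its methods are hypothetical names, and unlike Kafka (which persists messages in a log that consumers read from) this sketch simply pushes each message to registered callbacks.

```python
class InMemoryBroker:
    """Toy stand-in for a message broker; everything lives in process memory."""

    def __init__(self):
        self._subscribers = {}  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback invoked for each message published to the topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber of the topic.

        Returns the number of subscribers that received it.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    broker = InMemoryBroker()
    broker.subscribe("sensor-readings", lambda msg: print("received:", msg))
    broker.publish("sensor-readings", {"temperature": 21.5})
```

The publisher only knows the topic name, not who consumes the message, which is the decoupling that makes this pattern suitable for real-time pipelines.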
10. APACHE STORM
Distributed framework for real-time applications
ETL, continuous computation, online machine learning
Millions of operations per second on each node
11. APACHE HBASE
Non-relational database (NoSQL) for Big Data applications
Distributed, fault-tolerant and scalable
Built on top of HDFS (Hadoop Distributed File System)
12. APACHE HIVE
SQL-like language to query data in Hadoop systems
Word count program
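The word count program shown on the slide is not reproduced in this transcript. As a stand-in, the following minimal plain-Python sketch shows the computation in the map, shuffle and reduce phases that Hadoop-style systems use. It is neither the Hive query nor the code from the talk, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted counts by word, as happens between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data on Azure", "big data with Spark"]
    print(word_count(sample))
```

In a real cluster the map and reduce phases run in parallel across nodes; here they run sequentially in one process purely to show the data flow.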
14. DEMO: PYSPARK APPLICATION
Log analysis with PySpark
source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-custom-library-website-log-analysis
Predictive analysis on food inspection with PySpark
source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-machine-learning-mllib-ipython
25. DEMO: CREDIT RISK ANOMALY DETECTION
source: https://gallery.cortanaintelligence.com/Experiment/1219e87f8fb84e88a2e1b54256808bb3
27. WHAT IS THE DSVM
Windows:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- Visual Studio
- SQL Server
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost
Linux:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- PyCharm
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost, Weka
28. DEEP LEARNING DSVM
Libs:
- CNTK
- MXNet
- TensorFlow
- Keras
Examples: digit recognition, image recognition
30. AI LANDSCAPE: IMAGES
ImageNet (image recognition competition) top-5 error (%):
- AlexNet (2012): 15.4%
- VGG (2014): 7.3%
- Inception (2015): 6.7%
- ResNet (2015): 3.6%
- Inception-ResNet (2016): 3.1%
- Human: 5.1%
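For readers unfamiliar with the top-5 error metric reported on this slide: a prediction counts as correct when the true class is among the five classes the model scores highest. A minimal sketch in plain Python follows; the function name and the list-of-lists input format are assumptions made for illustration, not part of any benchmark toolkit.

```python
def top5_error(scores, labels):
    """Fraction of samples whose true label is not among the 5 top-scoring classes.

    scores: one list of per-class scores for each sample.
    labels: the true class index for each sample.
    """
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda i: sample_scores[i],
            reverse=True,
        )
        if label not in ranked[:5]:
            misses += 1
    return misses / len(labels)


if __name__ == "__main__":
    scores = [
        [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.05],  # true class 6 is ranked last
        [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3],   # true class 0 is ranked first
    ]
    print(top5_error(scores, labels=[6, 0]))
```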
31. AI LANDSCAPE: SPEECH
Microsoft Research achieves human parity in conversational speech recognition
source: http://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition
- CNN (VGG, ResNet, LACE)
- RNN (Bi-LSTM)
- Multi-GPU and multi-server training (1-bit Stochastic Gradient Descent)
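The idea behind 1-bit Stochastic Gradient Descent is to cut communication between GPUs and servers by sending one bit per gradient value, while carrying the quantization error over to the next step so it is not lost. The sketch below illustrates that idea in plain Python under stated assumptions: the threshold is zero, the two reconstruction values are the means of the positive and negative groups, and the function names are hypothetical. It is not the CNTK implementation.

```python
def one_bit_quantize(gradient, residual):
    """Quantize a gradient vector to one bit per element with error feedback.

    gradient: list of floats for the current step.
    residual: quantization error carried over from the previous step.
    Returns (bits, pos_value, neg_value, new_residual).
    """
    # Add back the error left over from the previous quantization.
    corrected = [g + r for g, r in zip(gradient, residual)]
    # One bit per element: True for non-negative, False for negative.
    bits = [c >= 0 for c in corrected]
    positives = [c for c, b in zip(corrected, bits) if b]
    negatives = [c for c, b in zip(corrected, bits) if not b]
    # Assumption: each group is reconstructed as its mean value.
    pos_value = sum(positives) / len(positives) if positives else 0.0
    neg_value = sum(negatives) / len(negatives) if negatives else 0.0
    reconstructed = dequantize(bits, pos_value, neg_value)
    # The error that the bits could not express is kept for the next step.
    new_residual = [c - q for c, q in zip(corrected, reconstructed)]
    return bits, pos_value, neg_value, new_residual


def dequantize(bits, pos_value, neg_value):
    """Rebuild an approximate gradient from the bits and two scalar values."""
    return [pos_value if b else neg_value for b in bits]


if __name__ == "__main__":
    grad = [1.0, 3.0, -2.0, -4.0]
    bits, pos, neg, residual = one_bit_quantize(grad, [0.0] * len(grad))
    print(bits, pos, neg, residual)
```

Only the bits and the two scalars need to be exchanged between workers; the residual stays local and is added to the next gradient before quantizing again.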
32. IMAGE CLASSIFICATION
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
33. IMAGE CLASSIFICATION IMAGENET
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
[Figure: example images labelled with real class and predicted class]
34. TEXT CLASSIFICATION
[Architecture diagram]
Train: Dataset, Azure NC24 VM with 4 K80 GPUs (.R), model.params
Score: Azure Cloud Services (.py, .js, .html), Web app, API, Backend, DNN, input text
35. DEMO: TEXT CLASSIFICATION WEB APP
36. LEVERAGING DATA DRIVEN RESEARCH
THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
miguel.gonzalezfierro@microsoft.com
https://miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK