$$\rho \frac{Dv}{Dt} = -\nabla p + \nabla \cdot \boldsymbol{T} + \boldsymbol{f}$$
Data
Acquisition &
modelling
Collaboration
and
visualisation
Analysis &
data mining
Dissemination
& sharing
Archiving and
preserving
fourthparadigm.org
Data-intensive Research
X-Info
• Data ingest
• Managing a petabyte
• Common schema
• How to organize it
• How to reorganize it
• How to share with others
• Query and Vis tools
• Building and executing models
• Integrating data and Literature
• Documenting experiments
• Curation and long-term preservation
The Generic Problems
Experiments &
Instruments
Simulations
Literature
Other Archives
facts
facts
facts
facts
Questions
Answers
Gartner: http://t.co/Co3EK1ERfN
https://www.youtube.com/watch?v=TJTSEPpFZaw
A-series
• 1-16 cores
• 0.75-112 GB RAM
• 20-605 GB HDD
• Up to InfiniBand 40 Gbit/s RDMA network (MPI)
D-series
• 1-16 cores
• 3.5-112 GB RAM
• Up to 800 GB SSD
G-series
• 32 cores
• 468 GB RAM
• 6.5 TB SSD
Parker MacCready: Univ. of Washington
Rob Fatland, Wenming Ye, Nels Oscar: Microsoft Research
Modeling Workflow
Forcing Data
Processed into
Standard Format
Output
MODEL
Model-specific
Forcing Files
Raw Forcing
Data
Observations
Processed into
Standard Format
SKILL TEST
Raw Observational
Data
Skill
Result
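The skill test in the workflow above compares model output with observations once both are in a standard format. The slide does not name the metric used, so the following is only a minimal sketch, assuming mean bias and root-mean-square error as the skill measures. The function name `skill` and the sample temperatures are hypothetical.

```python
import math

def skill(model, observed):
    """Return (bias, rmse) for paired model and observed values.

    Both metrics are assumptions for illustration; the slide does not
    say which skill measure is actually used.
    """
    if len(model) != len(observed) or not model:
        raise ValueError("need two non-empty series of equal length")
    errors = [m - o for m, o in zip(model, observed)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse

# Hypothetical sea-surface temperatures (deg C), not real data.
model_sst = [10.0, 11.0, 12.0, 13.0]
observed_sst = [10.5, 10.5, 12.5, 12.5]
bias, rmse = skill(model_sst, observed_sst)
```

With these made-up values the errors alternate between -0.5 and +0.5, so the bias cancels to zero while the RMSE stays at 0.5, which is why both numbers are usually reported together.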
ROMS
Cluster
200 cores
1 week/year
2 TB
per model year
Standard
Post Processing
1 week
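The cluster figures above (200 cores, 1 week of wall time and 2 TB of output per model year) can be turned into a rough resource estimate. This is a sketch under the assumption that all 200 cores run continuously for the full week; the function name `roms_cost` is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24

def roms_cost(model_years, cores=200, weeks_per_model_year=1, tb_per_model_year=2):
    """Estimate core-hours and raw output volume for a ROMS run.

    Defaults come from the slide; continuous use of every core for the
    whole week is an assumption.
    """
    core_hours = model_years * cores * weeks_per_model_year * HOURS_PER_WEEK
    output_tb = model_years * tb_per_model_year
    return core_hours, output_tb

# One model year: 200 cores x 168 hours, and 2 TB of output.
core_hours, output_tb = roms_cost(model_years=1)
```

Under that assumption a single model year costs 33,600 core-hours before the week of standard post-processing is counted.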
LiveOcean: Hybrid Architecture
HPC
Linux, 150 cores
Forecast
NetCDF files
LiveOcean
Server
• Post Processing
• Pre-make .png “views”
• Archive NetCDF files
• API for web sites
• Admin.js
• Client.js
Blob Storage:
Forecast Copy
Science User
python
Azure Table:
Log Info
Admin
Website
Client Website
http://mappable.azurewebsites.net/liveocean/
Rivers
USGS
Atmosphere
UW WRF
Ocean
HYCOM
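In the architecture above, a science user reads the forecast copy from Blob Storage with Python. A minimal sketch using only the standard library follows, assuming the container allows anonymous reads. The account, container, and blob names are hypothetical and do not come from the slide; the two helper functions are illustrative, not part of LiveOcean.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

def forecast_blob_url(account, container, blob_name):
    """Build the HTTPS URL of a blob in an Azure Storage account."""
    return "https://{}.blob.core.windows.net/{}/{}".format(
        account, container, quote(blob_name))

def download_forecast(url, local_path):
    """Download one NetCDF forecast file.

    Needs network access and a container that permits anonymous reads
    (an assumption); private containers require credentials instead.
    """
    urlretrieve(url, local_path)
    return local_path

# Hypothetical names, for illustration only.
url = forecast_blob_url("exampleaccount", "forecasts", "2015/ocean_his_0001.nc")
```

A call such as `download_forecast(url, "ocean_his_0001.nc")` would then fetch the file for local analysis.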
James Williams, SLAC CIO
ConnectTheDots.io
“The Azure for Research programme has helped
the Marine Institute and our research partners
understand how cloud computing can be used to
advance collaborative marine research including
by making on-demand compute and advanced
analytical data services much more easily
available to virtual research teams.”
Eoin O’Grady, Information Services and Development Manager,
Marine Institute (Ireland)
British Library Labs cloud
analysis of digital catalogues,
including 19th Century books
scanned by Microsoft.
@MechCuratorBot
mechanicalcurator.tumblr.com
RaaS
SaaS
PaaS
IaaS
Cloud Services
Research collaboration and data
lifecycle services
Data management, application
services, collaboration tools.
Programming abstractions,
database support, runtime
systems
Virtual machines, reliable
storage, provisioning tools,
network bandwidth
Research
Marketplace
Analytics services and expert
consulting
Domain specific applications
and data access
Advanced development tools
and libraries to SaaS
developers
Specially configured virtual
machine templates
www.azure4research.com
Use laptops &
desktop computers
Overwhelmed by
data
Finding analysis
ever more difficult;
sharing even
harder
Azure for Research Russia Special Awards
• 250,000 compute hours, 20 TB storage,
machine learning, NoSQL and more…
• Apply by 15 Aug’15 at
http://aka.ms/azureresearchrussia

Accelerating your research with Microsoft Azure
