Andrea Ferretti
Handling data and workflows in
computational materials science
The AiiDA initiative
Firenze, 15 Nov 2016
-  Highly accurate ab initio
methods in electronic structure
-  Large computational power
required (now available)
-  High-throughput screening
possible
-  Reduced need for exp data
COMPUTATIONAL MATERIALS’ SCIENCE
N. Marzari, Nature Materials, Apr 2016
PRL 105, 106601 (2010)
COMPUTATIONAL MATERIALS’ SCIENCE
G. Hautier et al, Nat Comm 4, 2292 (2013)
[Clipped text excerpt from the paper: "... p-type dopability have already been reported experimentally or computationally for several of them. B6O has been experimentally measured to show p-type conductivity. It has been demonstrated experimentally that PbZr0.5Ti0.5O3 can be ..."]
[Figure: effective mass versus band gap (eV) for the candidate compounds, with the regions of current p-type TCOs and current n-type TCOs marked. A second, clipped panel plots defect formation energy (eV).]
Figure 2 | Effective mass versus band gap for the p-type TCO candidates.
We superposed on the band gap axis a colour spectrum corresponding to
the wavelength associated with a photon energy. The TCO candidates are
marked with red dots. A few known p-type (blue diamonds) and n-type
(green square) TCOs can be compared to the new candidates. The best
TCOs should lie in the lower right corner. For clarity, we kept only one
representative when polymorphs existed for a given stoichiometry (for
-  Highly accurate ab initio
methods in electronic structure
-  Large computational power
required (now available)
-  High-throughput screening
possible
-  Reduced need for exp data
-  Data handling needed
COMPUTATIONAL MATERIALS’ SCIENCE
N. Marzari, Nature Materials, Apr 2016
PRL 105, 106601 (2010)
SOME THOUGHTS ON DATA
•  In computational science, data are naturally generated,
so the workflows that create properties and data from a
structure are key
•  Curated data are needed (e.g. for verification or for
machine learning)
•  A model of data-on-demand can be implemented
(high-throughput pushes the development of robust
workflows to calculate automatically).
OBJECTIVES
•  Automation: run thousands of calculations daily
•  Provenance: all children and all parent data are
recorded
•  Reproducibility: go back to a simulation years later,
and redo it with new parameters or codes
•  Extensible/agnostic to models, codes and formats
•  Workflows: dynamical, robust, complex “turnkey
solutions” that calculate desired properties on demand
•  Sharing: provide the distributed environment to
disseminate workflows and data and to provide
services
ADES MODEL FOR COMPUTATIONAL
SCIENCE
G. Pizzi et al., Comp. Mat. Sci 111, 218-230 (2016)
Low-level pillars: Automation, Data. User-level pillars: Environment, Sharing.
ECOSYSTEM

Automation          Data         Environment            Sharing
Automation          Database     Research environment   Social
Remote management   Provenance   Scientific workflows   Sharing
High-throughput     Storage      Data analytics         Standards
A factory           A library    A scholar              A community
http://www.aiida.net
(MIT BSD, jointly developed with Robert Bosch)
G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016)
G. Pizzi, A.C., et al., arXiv:1504.01163
ADES
Automation in AiiDA
Remote management
Coupling to data
High throughput
Automation in AiiDA
1. The core of the code is the AiiDA API (Application Programming
Interface), a set of Python classes that exposes the key objects to
users: Calculations, Codes, and Data.
What is AiiDA?
2. The AiiDA Object-Relational Mapper (ORM) maps AiiDA objects into
Python Classes, so that the objects can be created/modified/queried via
an agnostic high-level interface. Any interaction with Storage occurs
transparently via Python calls.
3. A daemon manages calculation states (submission, retrieval,
parsing…) without user intervention (uses Python celery+supervisor
modules), through remote transports and Slurm/PBS Pro/SGE/
Torque plugins.
4. User interaction occurs via the command-line tool verdi, the
interactive shell, or Python scripts.
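To make the ORM idea in point 2 concrete, here is a minimal sketch, not AiiDA's actual implementation. A hypothetical `Node` class is mapped onto one row of an in-memory SQLite table, so storing and loading happen only through Python calls. The table layout, the class name and its methods are assumptions made for this example.

```python
import json
import sqlite3


class Node:
    """Hypothetical ORM-style object: one instance <-> one row in table 'node'."""

    def __init__(self, node_type, attributes=None, pk=None):
        self.node_type = node_type
        self.attributes = attributes or {}
        self.pk = pk  # None until the node is stored

    def store(self, conn):
        """Insert the node as a new row and remember its primary key."""
        cur = conn.execute(
            "INSERT INTO node (node_type, attributes) VALUES (?, ?)",
            (self.node_type, json.dumps(self.attributes)),
        )
        self.pk = cur.lastrowid
        return self

    @classmethod
    def load(cls, conn, pk):
        """Rebuild a Python object from the row with the given primary key."""
        row = conn.execute(
            "SELECT node_type, attributes FROM node WHERE id = ?", (pk,)
        ).fetchone()
        if row is None:
            raise KeyError(f"no node with pk={pk}")
        return cls(row[0], json.loads(row[1]), pk=pk)


def make_storage():
    """Create an in-memory database with a single 'node' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, node_type TEXT NOT NULL, attributes TEXT NOT NULL)"
    )
    return conn


conn = make_storage()
kpoints = Node("data", {"mesh": [10, 10, 2]}).store(conn)
same = Node.load(conn, kpoints.pk)
print(same.node_type, same.attributes)
```

The user code never writes SQL: the same `store()` / `load()` calls would work unchanged if the storage backend were swapped.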
Coupling automation with storage
•  The AiiDA API acts as a single interface to
heterogeneous, remote HPC resources, which are
abstracted away
– All work can be done on the local resources, and the user
does not need to connect explicitly to remote HPC
•  Coupling automation with storage ensures:
– uniformity of the input data, usage of codes and computers
(the same interface encompasses several supercomputers,
different schedulers, connection protocols…)
– full reproducibility and provenance, with automatic storage of
all data and links
– seamless sharing of calculations with other users
ADES
Data in AiiDA
Storage
Database
Provenance
The Open Provenance Model
•  Any calculation is a function,
manipulating an input to obtain an
output:
out1, out2 = F(in1, in2)
•  Each functional object is a node in a
graph, connected together with
directional, labeled links
•  Output nodes in turn can be used as
inputs of following calculations
[Diagram: data nodes in1 and in2 are inputs of the calculation node F, which produces the data nodes out1 and out2.]
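The relation out1, out2 = F(in1, in2) can be sketched as a small graph of nodes and labeled, directional links. This is only an illustration: `ProvenanceGraph`, its methods and the link labels are hypothetical and are not AiiDA classes.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Toy directed graph: nodes have a kind, links are (source, target, label)."""

    kinds: dict = field(default_factory=dict)  # node name -> "data" or "calc"
    links: list = field(default_factory=list)  # (source, target, label)

    def add_node(self, name, kind):
        self.kinds[name] = kind

    def add_link(self, source, target, label):
        self.links.append((source, target, label))

    def record_calculation(self, calc, inputs, outputs):
        """Record calc as a node, with labeled links from inputs and to outputs."""
        self.add_node(calc, "calc")
        for label, name in inputs.items():
            self.kinds.setdefault(name, "data")
            self.add_link(name, calc, label)
        for label, name in outputs.items():
            self.add_node(name, "data")
            self.add_link(calc, name, label)

    def parents(self, name):
        """Return the nodes with a link pointing to `name`."""
        return [src for src, dst, _ in self.links if dst == name]


graph = ProvenanceGraph()
# out1, out2 = F(in1, in2)
graph.record_calculation("F", {"x": "in1", "y": "in2"}, {"a": "out1", "b": "out2"})
# An output becomes the input of a following calculation G.
graph.record_calculation("G", {"x": "out1"}, {"a": "out3"})

print(graph.parents("F"))     # the inputs of F
print(graph.parents("out3"))  # the calculation that created out3
```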
DIRECTED ACYCLIC GRAPHS
Nodes:
Calculations
Codes
Data
Saving the DAGs: Nodes and Links
Nodes and links: a graph structure
•  Each node: row in a SQL table + folder for files
•  Links also stored in a SQL table
[Diagram labels: jobs, provenance]
Transitive closure (TC) table
•  Allows queries that traverse the graph
•  Automatically updated using triggers
•  Queries using TC in SQL faster than with
graph DB backends!
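A trigger-maintained transitive closure table can be sketched as follows. The slides do not show AiiDA's actual schema or triggers, so the table names, the trigger and the use of SQLite (instead of PostgreSQL, to keep the example self-contained) are all assumptions. The sketch only handles link insertion in an acyclic graph.

```python
import sqlite3

SCHEMA = """
CREATE TABLE link (src INTEGER NOT NULL, dst INTEGER NOT NULL, label TEXT);
CREATE TABLE tc (
    ancestor INTEGER NOT NULL,
    descendant INTEGER NOT NULL,
    PRIMARY KEY (ancestor, descendant)
);

-- On every new link src -> dst, add all the paths that the link creates.
CREATE TRIGGER link_added AFTER INSERT ON link
BEGIN
    INSERT OR IGNORE INTO tc VALUES (NEW.src, NEW.dst);
    INSERT OR IGNORE INTO tc
        SELECT up.ancestor, NEW.dst FROM tc AS up
        WHERE up.descendant = NEW.src;
    INSERT OR IGNORE INTO tc
        SELECT NEW.src, down.descendant FROM tc AS down
        WHERE down.ancestor = NEW.dst;
    INSERT OR IGNORE INTO tc
        SELECT up.ancestor, down.descendant FROM tc AS up, tc AS down
        WHERE up.descendant = NEW.src AND down.ancestor = NEW.dst;
END;
"""


def descendants(conn, node):
    """All nodes reachable from `node`, read from the TC table without recursion."""
    rows = conn.execute(
        "SELECT descendant FROM tc WHERE ancestor = ? ORDER BY descendant", (node,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Chain 1 -> 2 -> 3 -> 4, with the links inserted out of order on purpose.
conn.executemany(
    "INSERT INTO link VALUES (?, ?, ?)",
    [(3, 4, "output"), (1, 2, "input"), (2, 3, "output")],
)
print(descendants(conn, 1))
```

The graph traversal is paid once, at insertion time; a query such as "everything derived from node 1" then becomes a plain indexed lookup.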
Benchmark against Neo4j
•  Graph databases exist (Neo4j)
•  They are still young, while SQL is very mature
•  Our benchmark (with PostgreSQL) vs. Neo4j on the same realistic
data (~11K graphs, ~100K nodes, >1M attributes)
[Plot: query time (s) versus number of results, for AiiDA (query 1 and 2), Neo4j (query 1) and Neo4j (query 2).]
The AiiDA daemon
A daemon runs in the background
Calculation state
SUBMITTING
WITHSCHEDULER
RETRIEVING
PARSING
FINISHED
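The sequence of calculation states above can be pictured as a simple state machine. The sketch below is hypothetical: a real daemon would advance a calculation only when the scheduler or the retrieval step reports progress, and failure states are omitted. Only the five state names are taken from the slide.

```python
from enum import Enum


class CalcState(Enum):
    SUBMITTING = 1
    WITHSCHEDULER = 2
    RETRIEVING = 3
    PARSING = 4
    FINISHED = 5


ORDER = list(CalcState)  # members in definition order


def next_state(state):
    """Return the state that follows `state`; FINISHED is terminal."""
    if state is CalcState.FINISHED:
        return state
    return ORDER[ORDER.index(state) + 1]


def daemon_tick(calculations):
    """One hypothetical daemon pass: advance every calculation by one state."""
    return {name: next_state(state) for name, state in calculations.items()}


calcs = {"relax-1": CalcState.SUBMITTING, "relax-2": CalcState.PARSING}
for _ in range(2):
    calcs = daemon_tick(calcs)
print({name: state.name for name, state in calcs.items()})
```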
ADES
Environment in AiiDA
High-level workspace
Scientific workflows
Data analytics
Environment in AiiDA: plugins
All functionality provided using a plugin interface
•  Calculation: generation of input files for a given code (Quantum ESPRESSO, Phonopy, GPAW, Yambo, NWChem, …)
•  Data: management of data objects for input/output (files & folders, parameter sets, remote data, structures, pseudos, ...)
•  Parser: parsing of code output and generation of new DB nodes (Quantum ESPRESSO, Phonopy, GPAW, Yambo, NWChem, ...)
•  Transport: how to connect to a cluster (local connection, ssh, ...)
•  Scheduler: how to interact with the scheduler (PBSPro, Torque, SGE, SLURM, ...)
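A plugin interface of this kind might look like the sketch below, shown for the scheduler case. The class names, the registry and `load_scheduler` are hypothetical and do not reproduce AiiDA's plugin API; only the `sbatch` (SLURM) and `qsub` (Torque) submission commands are real.

```python
from abc import ABC, abstractmethod


class Scheduler(ABC):
    """Hypothetical plugin interface: how to talk to one kind of batch scheduler."""

    @abstractmethod
    def submit_command(self, script):
        """Return the command (as an argument list) that submits `script`."""


class SlurmScheduler(Scheduler):
    def submit_command(self, script):
        return ["sbatch", script]


class TorqueScheduler(Scheduler):
    def submit_command(self, script):
        return ["qsub", script]


# Registry mapping a plugin name to its class; a new plugin only adds an entry.
SCHEDULER_PLUGINS = {"slurm": SlurmScheduler, "torque": TorqueScheduler}


def load_scheduler(name):
    """Instantiate the scheduler plugin registered under `name`."""
    try:
        return SCHEDULER_PLUGINS[name]()
    except KeyError:
        raise ValueError(f"unknown scheduler plugin: {name!r}") from None


for name in ("slurm", "torque"):
    print(name, load_scheduler(name).submit_command("job.sh"))
```

Code that submits jobs depends only on the `Scheduler` interface, so supporting another scheduler does not touch the rest of the system.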
Environment in AiiDA: Workflows
•  Full Python scripting capabilities
•  AiiDA manages calculation dependencies
•  Workflows are modular: users can build on the workflows of others
•  A step can call nested subworkflows
•  Develop turn-key solutions for the calculation of material
properties: libraries of workflows
Workflows features
•  Automatic provenance tracking, stored in the DB using simple
Python functions
– inputs, outputs and function calls are stored by adding a simple decorator to existing functions
•  Serial and parallel execution support
– can launch long-running tasks on separate threads and wait for the result when needed
•  Control of provenance granularity
– store the level of detail relevant to the workflow
•  Seamless mixing of local and remote jobs
•  Progress checkpointing
– restart from an arbitrary step, retry on failure
•  Easy debugging
– execute workflows in an IDE and observe/change the state of variables as they run
•  Background execution
– daemon execution allows the machine to be shut down and the workflow to continue from the last point,
essential for running long remote jobs
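The decorator-based provenance tracking listed above can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions: the `tracked` decorator and the in-memory `PROVENANCE_LOG` list (standing in for the database) are invented for the example and are not AiiDA's decorators.

```python
import functools

PROVENANCE_LOG = []  # stands in for the database in this sketch


def tracked(func):
    """Record the name, inputs and output of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append(
            {"function": func.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result

    return wrapper


@tracked
def scale_volume(volume, factor):
    return volume * factor


@tracked
def strain_series(volume, factors):
    # Calls to other tracked functions are recorded as well.
    return [scale_volume(volume, f) for f in factors]


strain_series(100.0, [0.98, 1.0, 1.02])
for entry in PROVENANCE_LOG:
    print(entry["function"], entry["args"], "->", entry["output"])
```

The existing functions are unchanged apart from the added decorator line, which is the point made on the slide.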
WORKFLOWS ENCODING CORE KNOWLEDGE
CHRONOS workflow: electronic-magnetic-atomic structure
[Flowchart boxes: Structure; Set of tested & converged pseudos (SSSP); Non-magnetic energy relaxation; Testing metallic character; Generating structures with random magnetizations; Magnetic energy relax. (three boxes); Lowest energy configuration; Final energy relaxation + bands; Energy calculation + bands; Fully relaxed structure; Electronic bands; Finding magnetic properties]
PHONON workflow: phonon dispersions (+elastic, dielectric)
[Flowchart boxes: Input parameters; Energy calculation; Phonon initialization; q-points distribution; Single q calculation (three boxes); Phonon calculation (two boxes); Collect results; Dynamical matrices; Fourier interpolation; Phonon dispersion. Annotations: Phonon "restart" sub-workflow; loops on itself if it fails (change parameters); restart if clean stop (max CPU time reached)]
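The "loops on itself if it fails (change parameters)" behaviour can be sketched as a retry loop. Everything in the example is hypothetical: `fake_step` stands in for a remote calculation, and the `mixing` parameter and the halving rule are made up for illustration.

```python
def run_with_retries(step, parameters, adjust, max_attempts=3):
    """Call step(parameters); on failure, adjust the parameters and try again."""
    attempts = []
    for _ in range(max_attempts):
        attempts.append(dict(parameters))
        try:
            return step(parameters), attempts
        except RuntimeError:
            parameters = adjust(parameters)
    raise RuntimeError(f"step still failing after {max_attempts} attempts")


def fake_step(parameters):
    """Stand-in for a remote calculation: succeeds only with a small 'mixing'."""
    if parameters["mixing"] > 0.3:
        raise RuntimeError("did not converge")
    return {"converged": True, "mixing": parameters["mixing"]}


def halve_mixing(parameters):
    """Return a copy of the parameters with 'mixing' divided by two."""
    return {**parameters, "mixing": parameters["mixing"] / 2}


result, attempts = run_with_retries(fake_step, {"mixing": 0.7}, halve_mixing)
print(result, len(attempts))
```

Keeping the list of attempted parameter sets means the failed tries remain part of the record, in line with the provenance goals stated earlier.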
[Figure: provenance graph stored by AiiDA for an elastic-constants workflow on InSe, shown first in part and then in full. Node types and labels recoverable from the figure:
– Data nodes: CifData; StructureData ('3D_with_2D_substructure', InSe); ParameterData (parameters, settings, lagrangian_strain_0 to lagrangian_strain_10, bestfit_0, bestfit_1, output_parameters); KpointsData (10x10x2); UpfData (pseudo_In, pseudo_Se); SinglefileData (vdw_table); RemoteData (parent_calc_folder)
– Codes: 'cif_filter', 'cif_select', 'pw-5.2-rhoxml-piz-dora_aprun', 'pw-5.2-rhoxml-piz-daint'
– Calculations: CiffilterCalculation (FINISHED); PwCalculation (vc-relax and relax; all FINISHED except one relax marked FAILED); InlineCalculation (primitive_structure_inline(), standardize_structure_inline(), deformation_inline(), best_fit_inline(), elastic_constants_inline())
– Link labels: cif, code, structure, parameters, settings, kpoints, primitive_structure_spg, standardized_structure, deformed_structure_N, output_structure, output_parameters, remote_folder]
WHAT REALLY HAPPENS
ADES
Sharing in AiiDA
Social ecosystem
Repository pipelines
Standardization
Sharing in AiiDA
[Diagram: clusters, users and databases for Group 1, Group 2 and Group 3; each group holds private data and public/shared data, with some data shared between groups.]
•  Sharing model in AiiDA
•  Data can be pushed to the
outside world or other
repositories
•  Importer of previous
calculations
•  UUIDs used to uniquely identify all
data/calculation objects
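Why UUIDs help when data move between groups can be shown with a short sketch. The record layout and the `import_nodes` helper are hypothetical; only Python's standard `uuid` module is real. Because identifiers are generated independently and are globally unique, a receiving database can merge an export without renumbering and can skip what it already holds.

```python
import uuid


def new_node(label):
    """Create a node record identified by a random (version 4) UUID."""
    return {"uuid": str(uuid.uuid4()), "label": label}


def import_nodes(local_db, exported_nodes):
    """Merge exported nodes into local_db, skipping nodes already present."""
    added = 0
    for node in exported_nodes:
        if node["uuid"] not in local_db:
            local_db[node["uuid"]] = node
            added += 1
    return added


group1_export = [new_node("InSe structure"), new_node("relaxed InSe structure")]
group2_db = {}
print(import_nodes(group2_db, group1_export))  # first import adds both nodes
print(import_nodes(group2_db, group1_export))  # importing again adds nothing
```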
MATERIALS CLOUD
INFRASTRUCTURE
•  server side AiiDA API
•  federated data via iRODS
•  client side API in AngularJS
CONCLUSIONS
•  In computational science, data are naturally
calculated, not harvested
•  ADES model
(automation – data – environment – sharing)
•  AiiDA v1.0 released by end of 2016
•  A DMP is part of (and distributed with) the AiiDA sw
•  AiiDA as a turn-key solution for Data management
THE AiiDA TEAM
Giovanni Pizzi (EPFL)
Riccardo Sabatini (Hum. Longevity)
Andrea Cepellotti (EPFL)
Andrius Merkys (Vilnius)
Nicolas Mounet (EPFL)
Boris Kozinsky (BOSCH)
Martin Uhrin (EPFL)
Spyros Zoupanos (EPFL)
Snehal Waychal (EPFL)
Nicola Varini (EPFL)
Leonid Kahle (EPFL)
Anton Kozhevnikov (CSCS)
Fernando Gargiulo (EPFL)
Georgy Samsonidze, Prateek Mehta, Andrea Greco @ Bosch
SUPPORT MOSTLY FROM
http://nccr-marvel.ch
http://www.bosch.us
http://max-centre.eu
http://nffa.eu
http://emmc.info
Financial plan training and templates
 

Similar to Handling data and workflows in computational materials science: the AiiDA initiative

FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
Role of python in hpc
Role of python in hpcRole of python in hpc
Role of python in hpc
Dr Reeja S R
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
Anubhav Jain
 
Pathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaborationPathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaboration
EOSC-hub project
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Herman Wu
 
2016-10-20 BioExcel: Advances in Scientific Workflow Environments
2016-10-20 BioExcel: Advances in Scientific Workflow Environments2016-10-20 BioExcel: Advances in Scientific Workflow Environments
2016-10-20 BioExcel: Advances in Scientific Workflow Environments
Stian Soiland-Reyes
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
Carole Goble
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme Scales
Ian Foster
 
Big Data Architecture for Sensing Applications
Big Data Architecture for Sensing ApplicationsBig Data Architecture for Sensing Applications
Big Data Architecture for Sensing Applications
harshitha kurella
 
Scientific
Scientific Scientific
Scientific
marpierc
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
Ilkay Altintas, Ph.D.
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science research
Anubhav Jain
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
Ian Foster
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
Daniel S. Katz
 
grid computing
grid computinggrid computing
grid computing
elliando dias
 
PEARC17:A real-time machine learning and visualization framework for scientif...
PEARC17:A real-time machine learning and visualization framework for scientif...PEARC17:A real-time machine learning and visualization framework for scientif...
PEARC17:A real-time machine learning and visualization framework for scientif...
Feng Li
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...
Anubhav Jain
 
Reliable, Remote Computation at All Scales
Reliable, Remote Computation at All ScalesReliable, Remote Computation at All Scales
Reliable, Remote Computation at All Scales
Globus
 
Rack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC Supercomputer
Rebekah Rodriguez
 

Similar to Handling data and workflows in computational materials science: the AiiDA initiative (20)

FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Role of python in hpc
Role of python in hpcRole of python in hpc
Role of python in hpc
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
 
Pathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaborationPathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaboration
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
2016-10-20 BioExcel: Advances in Scientific Workflow Environments
2016-10-20 BioExcel: Advances in Scientific Workflow Environments2016-10-20 BioExcel: Advances in Scientific Workflow Environments
2016-10-20 BioExcel: Advances in Scientific Workflow Environments
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme Scales
 
Big Data Architecture for Sensing Applications
Big Data Architecture for Sensing ApplicationsBig Data Architecture for Sensing Applications
Big Data Architecture for Sensing Applications
 
Scientific
Scientific Scientific
Scientific
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science research
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
 
grid computing
grid computinggrid computing
grid computing
 
PEARC17:A real-time machine learning and visualization framework for scientif...
PEARC17:A real-time machine learning and visualization framework for scientif...PEARC17:A real-time machine learning and visualization framework for scientif...
PEARC17:A real-time machine learning and visualization framework for scientif...
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...
 
Reliable, Remote Computation at All Scales
Reliable, Remote Computation at All ScalesReliable, Remote Computation at All Scales
Reliable, Remote Computation at All Scales
 
Rack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC Supercomputer
 

More from Research Data Alliance

RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020
Research Data Alliance
 
RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020
Research Data Alliance
 
RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020
Research Data Alliance
 
RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020
Research Data Alliance
 
RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020
Research Data Alliance
 
RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020
Research Data Alliance
 
RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020
Research Data Alliance
 
RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020
Research Data Alliance
 
RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020
Research Data Alliance
 
Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019
Research Data Alliance
 
Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019
Research Data Alliance
 
RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019
Research Data Alliance
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
Research Data Alliance
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
Research Data Alliance
 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure Providers
Research Data Alliance
 
Rda in a nutshell september 2019
Rda in a nutshell september 2019Rda in a nutshell september 2019
Rda in a nutshell september 2019
Research Data Alliance
 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing Research
Research Data Alliance
 
RDA Value for Libraries
RDA Value for LibrariesRDA Value for Libraries
RDA Value for Libraries
Research Data Alliance
 
The Value of the RDA for Funders
The Value of the RDA for FundersThe Value of the RDA for Funders
The Value of the RDA for Funders
Research Data Alliance
 
Rda in a nutshell august 2019
Rda in a nutshell august 2019Rda in a nutshell august 2019
Rda in a nutshell august 2019
Research Data Alliance
 

More from Research Data Alliance (20)

RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020
 
RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020
 
RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020
 
RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020
 
RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020
 
RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020
 
RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020
 
RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020
 
RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020
 
Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019
 
Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019
 
RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure Providers
 
Rda in a nutshell september 2019
Rda in a nutshell september 2019Rda in a nutshell september 2019
Rda in a nutshell september 2019
 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing Research
 
RDA Value for Libraries
RDA Value for LibrariesRDA Value for Libraries
RDA Value for Libraries
 
The Value of the RDA for Funders
The Value of the RDA for FundersThe Value of the RDA for Funders
The Value of the RDA for Funders
 
Rda in a nutshell august 2019
Rda in a nutshell august 2019Rda in a nutshell august 2019
Rda in a nutshell august 2019
 

Recently uploaded

A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens""Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
sameer shah
 

Recently uploaded (20)

A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens""Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
 

Handling data and workflows in computational materials science: the AiiDA initiative

  • 1. Andrea Ferretti Handling data and workflows in computational materials science The AiiDA initiative Firenze, 15 Nov 2016
  • 2. -  Highly accurate ab initio methods in electronic structure -  Large computational power required (now available) -  High-throughput screening possible -  Reduced need for exp data COMPUTATIONAL MATERIALS’ SCIENCE N. Marzari, Nature Materials, Apr 2016 PRL 105, 106601 (2010)
  • 3. COMPUTATIONAL MATERIALS’ SCIENCE G. Hautier et al, Nat Comm 4, 2292 (2013) [Cropped excerpt from the paper: "p-type dopability have already been reported experimentally or computationally for several of them. B6O has been experimentally measured to show p-type conductivity (ref. 31). It has been demonstrated experimentally that PbZr0.5Ti0.5O3 can be"] [Figure 3, cropped: defect formation energy (eV)] [Figure 2 plot: effective mass versus band gap (eV), with regions marking current p-type TCOs and current n-type TCOs. Compounds shown: ZnO, SnO2, In2O3, AlCuO2, SrCu2O2, ZnRh2O4, K2Sn2O3, Sb4Cl2O5, K2Pb2O3, PbTiO3, Ca4As2O, Ca4P2O, Sr4P2O, Sr4As2O, Hg2SO4, PbZrO3, NaNbO2, Tl4V2O7, Tl4O3, ZrSO, HfSO, B6O, Na2Sn2O3, PbHfO3] Figure 2 | Effective mass versus band gap for the p-type TCO candidates. We superposed on the band gap axis a colour spectrum corresponding to the wavelength associated with a photon energy. The TCO candidates are marked with red dots. A few known p-type (blue diamonds) and n-type (green square) TCOs can be compared to the new candidates. The best TCOs should lie in the lower right corner. For clarity, we kept only one representative when polymorphs existed for a given stoechiometry (for
  • 4. -  Highly accurate ab initio methods in electronic structure -  Large computational power required (now available) -  High-throughput screening possible -  Reduced need for exp data -  Data handling needed COMPUTATIONAL MATERIALS’ SCIENCE N. Marzari, Nature Materials, Apr 2016 PRL 105, 106601 (2010)
  • 5. SOME THOUGHTS ON DATA •  In computational science, data are naturally generated, so the workflows that create properties and data from a structure are key •  Curated data are needed (e.g. for verification or for machine learning) •  A model of data-on-demand can be implemented (high-throughput pushes the development of robust workflows to calculate automatically).
  • 6. OBJECTIVES •  Automation: run thousands of calculations daily •  Provenance: all children and all parent data are recorded •  Reproducibility: go back to a simulation years later, and redo it with new parameters or codes •  Extensible/agnostic to models, codes and formats •  Workflows: dynamical, robust, complex “turnkey solutions” that calculate desired properties on demand •  Sharing: provide the distributed environment to disseminate workflows and data and to provide services
  • 7. ADES MODEL FOR COMPUTATIONAL SCIENCE G. Pizzi et al., Comp. Mat. Sci 111, 218-230 (2016) Low-level pillars User-level pillars
  • 9. The four ADES pillars: Automation (a factory): automation, remote management, high-throughput. Data (a library): database, provenance, storage. Environment (a scholar): research environment, scientific workflows, data analytics. Sharing (a community): social, sharing, standards. http://www.aiida.net (MIT BSD, jointly developed with Robert Bosch) G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016)
  • 10. G. Pizzi, A.C., et al., arXiv:1504.01163 ADES Automation in AiiDA Remote management Coupling to data High throughput
  • 11. Automation in AiiDA 1. The core of the code is the AiiDA API (Application Programming Interface), a set of Python classes that exposes the users to the key objects: Calculations, Codes, and Data. What is AiiDA?
  • 12. Automation in AiiDA 2. The AiiDA Object-Relational Mapper (ORM) maps AiiDA objects into Python Classes, so that the objects can be created/modified/queried via an agnostic high-level interface. Any interaction with Storage occurs transparently via Python calls.
  • 13. Automation in AiiDA 3. A daemon manages calculation states (submission, retrieval, parsing…) without user intervention (it uses the Python celery and supervisor modules), through remote transports and Slurm/PBS Pro/SGE/Torque plugins.
  • 14. Automation in AiiDA 4. User interaction occurs via the command line tool verdi, the interactive shell, or Python scripts.
  • 15. Coupling automation with storage •  The AiiDA API acts as the single interface to heterogeneous, remote HPC resources, which are abstracted away – All work can be done on local resources, and the user does not need to connect explicitly to the remote HPC systems •  Coupling automation with storage ensures: – uniformity of the input data and of the usage of codes and computers (the same interface encompasses several supercomputers, different schedulers, connection protocols…) – full reproducibility and provenance, with automatic storage of all data and links – seamless sharing of calculations with other users
  • 16. G. Pizzi, A.C., et al., arXiv:1504.01163 ADES Data in AiiDA Storage Database Provenance
  • 17. The Open Provenance Model •  Any calculation is a function, manipulating an input to obtain an output: out1, out2 = F(in1, in2) •  Each functional object is a node in a graph, connected together with directional, labeled links •  Output nodes in turn can be used as inputs of following calculations [Diagram: data nodes in1 and in2 link into the calc node F, which links to the data nodes out1 and out2]
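The functional picture on this slide can be sketched as a small directed graph. What follows is a minimal illustration in plain Python, not AiiDA's own classes: `Node`, `Link`, `run_calc` and the `out1`/`out2` link labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # simple id generator for this sketch


@dataclass
class Node:
    kind: str                      # "data" or "calc"
    value: object = None
    pk: int = field(default_factory=lambda: next(_ids))


@dataclass(frozen=True)
class Link:
    source: int   # pk of the node the link starts from
    target: int   # pk of the node the link points to
    label: str


def run_calc(func, links, **inputs):
    """Run func on input data nodes, recording a calc node and labeled links."""
    calc = Node("calc", value=func.__name__)
    for label, node in inputs.items():
        links.append(Link(node.pk, calc.pk, label))
    results = func(**{name: node.value for name, node in inputs.items()})
    outputs = []
    for i, res in enumerate(results, start=1):
        out = Node("data", res)
        links.append(Link(calc.pk, out.pk, f"out{i}"))
        outputs.append(out)
    return calc, outputs


def F(in1, in2):
    """Toy calculation returning two outputs."""
    return in1 + in2, in1 * in2


links = []
a, b = Node("data", 2), Node("data", 3)
calc, (out1, out2) = run_calc(F, links, in1=a, in2=b)
# Output nodes can in turn be used as inputs of a following calculation.
calc2, (out3, out4) = run_calc(F, links, in1=out1, in2=out2)
```

Every input and output is tied to the calculation that consumed or produced it, so the list of links is enough to walk back from any result to its origins.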
  • 19. Saving the DAGs: Nodes and Links Nodes and links: a graph structure •  Each node: row in a SQL table + folder for files •  Links also stored in a SQL table [Figure labels: jobs, provenance] Transitive closure (TC) table •  Allows queries that traverse the graph •  Automatically updated using triggers •  Queries using TC in SQL faster than with graph DB backends!
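To make the transitive closure idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the SQL backend. The table and column names (`node`, `link`, `tc`, `src`, `dst`) and the trigger are assumptions made for illustration, not AiiDA's actual schema, and only link insertion is handled.

```python
import sqlite3

SCHEMA = """
CREATE TABLE node (id INTEGER PRIMARY KEY);
CREATE TABLE link (
    src   INTEGER NOT NULL REFERENCES node(id),
    dst   INTEGER NOT NULL REFERENCES node(id),
    label TEXT NOT NULL
);
CREATE TABLE tc (
    ancestor   INTEGER NOT NULL,
    descendant INTEGER NOT NULL,
    PRIMARY KEY (ancestor, descendant)
);
-- On every new link, connect all ancestors of src (and src itself)
-- to all descendants of dst (and dst itself).
CREATE TRIGGER link_added AFTER INSERT ON link
BEGIN
    INSERT OR IGNORE INTO tc (ancestor, descendant)
    SELECT a.id, d.id
    FROM (SELECT ancestor AS id FROM tc WHERE descendant = NEW.src
          UNION SELECT NEW.src) AS a,
         (SELECT descendant AS id FROM tc WHERE ancestor = NEW.dst
          UNION SELECT NEW.dst) AS d;
END;
"""


def build_graph(links):
    """Create an in-memory database and insert (src, dst, label) links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    node_ids = sorted({n for src, dst, _ in links for n in (src, dst)})
    conn.executemany("INSERT INTO node (id) VALUES (?)", [(n,) for n in node_ids])
    conn.executemany("INSERT INTO link (src, dst, label) VALUES (?, ?, ?)", links)
    conn.commit()
    return conn


def descendants(conn, node_id):
    """All nodes reachable from node_id, read directly from the TC table."""
    rows = conn.execute(
        "SELECT descendant FROM tc WHERE ancestor = ? ORDER BY descendant",
        (node_id,),
    )
    return [row[0] for row in rows]


# Links inserted out of order on purpose: the trigger still closes the chain 1 -> 2 -> 3 -> 4.
conn = build_graph([(3, 4, "output"), (1, 2, "input"), (2, 3, "output")])
```

With the closure kept up to date by the trigger, a traversal query such as `descendants(conn, 1)` becomes a single indexed lookup instead of a recursive walk over the `link` table.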
  • 20. Benchmark against Neo4j •  Graph databases exist (Neo4j) •  They are still young, while SQL is very mature •  Our benchmark: PostgreSQL vs. Neo4j on the same realistic data (~11K graphs, ~100K nodes, >1M attributes) [Plot: axes are number of results and query time (s); series: AiiDA (query 1 and 2), Neo4j (query 1), Neo4j (query 2)]
  • 21. The AiiDA daemon A daemon runs in the background Calculation state: SUBMITTING → WITHSCHEDULER → RETRIEVING → PARSING → FINISHED
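The sequence of calculation states on this slide can be modelled as a simple linear state machine. The sketch below is illustrative only: `CalcState`, `next_state` and `daemon_tick` are hypothetical names, and each tick advances every calculation unconditionally, whereas a real daemon would only advance a calculation once the scheduler or the transport reports progress.

```python
from enum import Enum


class CalcState(Enum):
    SUBMITTING = 1
    WITHSCHEDULER = 2
    RETRIEVING = 3
    PARSING = 4
    FINISHED = 5


def next_state(state):
    """Return the state that follows `state`; FINISHED is terminal."""
    if state is CalcState.FINISHED:
        return state
    return CalcState(state.value + 1)


def daemon_tick(calcs):
    """One background pass: move every calculation one step forward."""
    return {name: next_state(state) for name, state in calcs.items()}


calcs = {"relax": CalcState.SUBMITTING, "bands": CalcState.PARSING}
for _ in range(4):
    calcs = daemon_tick(calcs)
```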
  • 22. G. Pizzi, A.C., et al., arXiv:1504.01163 ADES Environment in AiiDA High-level workspace Scientific workflows Data analytics
  • 23. Environment in AiiDA: plugins All functionality provided using a plugin interface •  Calculation: generation of input files for a given code (Quantum Espresso, Phonopy, GPAW, Yambo, NWChem, …) •  Data: management of data objects for input/output (files & folders, parameter sets, remote data, structures, pseudos, ...) •  Parser: parsing of code output and generation of new DB nodes (Quantum Espresso, Phonopy, GPAW, Yambo, NWChem, ...) •  Transport: how to connect to a cluster (local connection, ssh, ...) •  Scheduler: how to interact with the scheduler (PBSPro, Torque, SGE, SLURM, ...)
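A plugin interface of this kind is commonly built from an abstract base class plus a registry. The sketch below shows the pattern for the scheduler case only. The names `Scheduler`, `register`, `get_scheduler` and `submit_command` are assumptions for illustration and do not reproduce AiiDA's plugin API; `sbatch` and `qsub` are the standard submission commands of SLURM and Torque.

```python
from abc import ABC, abstractmethod

SCHEDULERS = {}  # registry: plugin name -> plugin class


def register(name):
    """Class decorator that adds a scheduler plugin to the registry."""
    def decorator(cls):
        SCHEDULERS[name] = cls
        return cls
    return decorator


class Scheduler(ABC):
    """Common interface every scheduler plugin must implement."""

    @abstractmethod
    def submit_command(self, script):
        """Return the command (as an argument list) that submits `script`."""


@register("slurm")
class SlurmScheduler(Scheduler):
    def submit_command(self, script):
        return ["sbatch", script]


@register("torque")
class TorqueScheduler(Scheduler):
    def submit_command(self, script):
        return ["qsub", script]


def get_scheduler(name):
    """Instantiate the plugin registered under `name`."""
    return SCHEDULERS[name]()


command = get_scheduler("slurm").submit_command("job.sh")
```

Code that submits jobs only depends on the `Scheduler` interface, so supporting another batch system means adding one registered class rather than touching the callers.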
  • 24. Environment in AiiDA: Workflows •  Full Python scripting capabilities •  AiiDA manages calculation dependencies •  Workflows are modular: users can build on the workflows of others •  A step can call nested subworkflows •  Turn-key solutions for the calculation of material properties can be developed as libraries of workflows
  • 25. Workflows features •  Automatic provenance tracking, stored in the DB using simple Python functions: inputs, outputs and function calls are stored by adding a simple decorator to existing functions •  Serial and parallel execution support: long-running tasks can be launched on separate threads and their results awaited when needed •  Control of provenance granularity: store the level of detail relevant to the workflow •  Seamless mixing of local and remote jobs •  Progress checkpointing: restart from an arbitrary step, retry on failure •  Easy debugging: execute workflows in an IDE and observe or change the state of variables as they run •  Background execution: running in the daemon allows the machine to be shut down and the workflow to continue from the last point, which is essential for long remote jobs
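The first feature, provenance tracking through a decorator, can be sketched as follows. This is a generic illustration and not AiiDA's decorator: the name `tracked` is hypothetical, the in-memory `PROVENANCE` list stands in for the database, and `rescale` is an arbitrary example function.

```python
import functools

# Stand-in for the database: one record per tracked function call.
PROVENANCE = []


def tracked(func):
    """Record the inputs, output and name of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE.append(
            {
                "function": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "output": result,
            }
        )
        return result

    return wrapper


@tracked
def rescale(volume, factor=1.0):
    # Existing function: only the decorator line was added.
    return volume * factor
```

The body of `rescale` is untouched, which is the point of the decorator approach: tracking is added to existing functions without rewriting them.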
  • 26. WORKFLOWS ENCODING CORE KNOWLEDGE •  CHRONOS workflow: electronic, magnetic and atomic structure. Diagram labels: structure; testing metallic character; generating structures with random magnetizations; magnetic energy relaxations; lowest energy configuration; non-magnetic energy relaxation; finding magnetic properties; final energy relaxation + bands; energy calculation + bands; fully relaxed structure; electronic bands •  PHONON workflow: phonon dispersions (+ elastic, dielectric). Diagram labels: input parameters; energy calculation; phonon initialization; q-points distribution; single q calculations; phonon calculations; collect results; dynamical matrices; Fourier interpolation; phonon dispersion •  Phonon “restart” sub-workflow: loops on itself if it fails (change parameters); restart if clean stop (max CPU time reached) •  Set of tested & converged pseudos (SSSP)
  • 27. WHAT REALLY HAPPENS [Figure: provenance graph of an elastic-constants calculation for an InSe structure ('3D_with_2D_substructure'). Calculation nodes: CiffilterCalculation (codes 'cif_select' and 'cif_filter'); InlineCalculation (primitive_structure_inline(), standardize_structure_inline(), deformation_inline(), best_fit_inline(), elastic_constants_inline()); PwCalculation (vc-relax and relax runs, codes 'pw-5.2-rhoxml-piz-daint' and 'pw-5.2-rhoxml-piz-dora_aprun'; all FINISHED except one relax marked FAILED). Data nodes: CifData; StructureData; ParameterData (parameters, settings, lagrangian_strain_0 to lagrangian_strain_10, bestfit_0, bestfit_1, output_parameters); KpointsData (10x10x2); UpfData (pseudo_In, pseudo_Se); SinglefileData (vdw_table); RemoteData (parent_calc_folder).]
  • 28. G. Pizzi, A.C., et al., arXiv:1504.01163 ADES Sharing in AiiDA Social ecosystem Repository pipelines Standardization
  • 29. Sharing in AiiDA •  Sharing model in AiiDA •  Data can be pushed to the outside world or to other repositories •  Importer of previous calculations •  UUIDs are used to uniquely identify all data/calculation objects [Figure: clusters, users and databases of Group 1, Group 2 and Group 3; each group keeps private data, and some data is shared as public/shared data]
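Why UUIDs matter for sharing can be shown with a small sketch: identifiers generated independently in different databases do not clash, so importing reduces to a merge keyed on the UUID. The `new_node` and `import_nodes` helpers and the dictionary layout are hypothetical and are not AiiDA's import/export format.

```python
import uuid


def new_node(label):
    """Create a node record with a random (version 4) UUID."""
    return {"uuid": str(uuid.uuid4()), "label": label}


def import_nodes(local, incoming):
    """Merge `incoming` node records into `local`, a dict keyed by UUID.

    Nodes whose UUID is already present are skipped, so importing the
    same export twice does not create duplicates. Returns the number
    of nodes actually added.
    """
    added = 0
    for node in incoming:
        if node["uuid"] not in local:
            local[node["uuid"]] = node
            added += 1
    return added
```

A sequential integer key would not work here, since two groups would both have a node number 1; the UUID identifies the object independently of the database it was created in.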
  • 30. MATERIALS CLOUD INFRASTRUCTURE •  server-side AiiDA API •  federated data via iRODS •  client-side API in AngularJS
  • 31. CONCLUSIONS •  In computational science, data are naturally calculated, not harvested •  ADES model (automation, data, environment, sharing) •  AiiDA v1.0 released by end of 2016 •  A DMP is part of (and distributed with) the AiiDA software •  AiiDA as a turn-key solution for data management