- An open-source, not-for-profit project
- On GitHub as ‘DawnScience’
- Diamond Light Source Ltd. and the ESRF are
largely publicly funded research facilities
Who
– ORNL
– DAWN Collaboration Members
– IBM
– IFP Energies Nouvelles
– German Aerospace Center
– MARINTEK / Itema
– Oak Ridge National Laboratory
– Paul Scherrer Institute
– University of Hamburg - Chemclipse
– Uppsala University - Bioclipse
– Others not on the steering committee
• Charter including a vision for the future
• Applying for funding for a full time
developer
• Contributions from DAWN and other
members being made.
• Presentations in the US (2012) and
Germany (2013)
• Eclipse Foundation investing in the group
• Git / Jenkins / Marketing
Collaborations
Science Working Group
Disclaimer
AKA - who says that?
• A Java Software Developer (not a Scientist) who has worked for 16 years
with various Java-based applications in science and
engineering
• I will attempt to explain a bit of the science for your enjoyment
(hopefully not schadenfreude).
• Talk biased towards how Diamond and the ESRF are using Ptolemy 2
• An Eclipse/RCP fan
Matthew Gerring
Synchrotron
AKA – cool word, but what does it mean?
syn·chro·tron/ˈsiNGkrəˌträn/
Noun: A cyclotron in which the magnetic field
strength increases with the energy of the
particles to keep their orbital radius constant.
“They are machines which produce
very strong light used for many
different type of scientific experiments
and sometimes other things.”
ESRF
(Experimental Facility)
Diamond
(User Facility)
The Queen and Duke of Edinburgh
at the official opening of DLS, 19th
October 2007
Inside the storage
ring [not star-trek
“conduit”...]
Scientists with some
of the hardware used
in their research
Video of Diamond...
Detectors of
Various
flavours
Responsibilities
AKA – what developers do at Diamond...
• Software for controlling experiments
– Motors, detectors, configuration.
– A high quality and flexible GUI.
– Ptolemy 2 not currently used.
– Data collection scripts
• Software for data
– Ability to visually interact with n-dimensional data (i.e. graphs and slices).
– Ability to write scripts to interact with data.
– Custom user interface and forms for specific experiments.
• Software for running analysis pipelines
– Hard coded and/or user configurable options.
– Real time visualization of analysed results.
– Ptolemy 2
Integration Tools
AKA – how we are getting it done
• Eclipse IDE - around 20 developers, 8 in scientific software
– Controls currently in process of migrating to RCP (~15 more developers)
• Eclipse RCP product built using Buckminster (previously PDE)
• Usage of Jenkins for continuous integration.
• Tests using Squish UI testing, JUnit and JUnit plug-in tests.
• We do not currently do code walkthroughs or pair programming. Agile
practices being used where otherwise possible.
• We document our designs and code using Confluence.
• We use Cheat Sheets for tutorials and testing guides.
• Source code control using Git/EGit (which has a pure Java client)
‘Shoulders of Giants’
• RCP for many of the core features: editors, toolbars, views, projects
• Ptolemy 2 (a version known as ‘Passerelle’) workflow and pipelining
• GEF for visualization of pipeline graphs
• Draw2D for 1D and 2D plotting (SWT XY Graph)
• PyDev for the Python/Jython scripting layer used by the scientists
• HDF5 libraries for storing large data sets
• SWT/JFace – lazy viewers being used extensively for large trees and tables
• Apache, Eclipse WST, SpringSource, JDK, and many more of course...
– For images
• Line, Box, Sector integration
• Diffraction image interpretation, line profile for ‘D-spacing’
• Color mapping / Histogramming
• Pixel Information and region control
– For XY Graphs
• Peak Fitting and Line Fitting
• Derivative and other functions, including user defined
• Scientific tools
– XAFS Analysis Tool
– SAXS
– Use of the Eclipse architecture: extension points and pages inside
PageBookView.
Lots of Visual Tools
Demonstration – Visual Tools
Example showing various visual tools
• Cutting through N-dimensional data
– With an XY plot
– As an image
– As a 3D iso-surface
– Hyper 3D
• Important to run everything concurrently
– Use of Jobs
– Use of ordinary threads
– Use of blocking queues
Slicing data
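The slicing and concurrency points above can be sketched in plain Java. This is a minimal illustration, not DAWN code: the class and method names (`SliceDemo`, `sliceZ`, `profileZ`, `sliceAllAsync`) are hypothetical, the data is a small in-memory array rather than an HDF5 file, and an ordinary thread stands in for an Eclipse Job so the example stays self-contained. It assumes Java 8 or later for the lambda.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch: slice a 3D stack on a worker thread and hand results over via a blocking queue. */
class SliceDemo {

    /** Extracts the 2D image at index z from a stack stored as [z][y][x]. */
    static double[][] sliceZ(double[][][] stack, int z) {
        int ny = stack[z].length;
        double[][] image = new double[ny][];
        for (int y = 0; y < ny; y++) {
            image[y] = stack[z][y].clone(); // copy so the consumer owns its data
        }
        return image;
    }

    /** Extracts a 1D profile along z at a fixed (y, x) pixel, e.g. for an XY plot. */
    static double[] profileZ(double[][][] stack, int y, int x) {
        double[] line = new double[stack.length];
        for (int z = 0; z < stack.length; z++) {
            line[z] = stack[z][y][x];
        }
        return line;
    }

    /** Produces every z-slice on a worker thread and returns the queue they arrive on. */
    static BlockingQueue<double[][]> sliceAllAsync(double[][][] stack) {
        // A small bound means the producer waits while the consumer is still busy.
        BlockingQueue<double[][]> queue = new ArrayBlockingQueue<>(2);
        Thread worker = new Thread(() -> {
            try {
                for (int z = 0; z < stack.length; z++) {
                    queue.put(sliceZ(stack, z)); // blocks when the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
            }
        }, "slice-worker");
        worker.setDaemon(true);
        worker.start();
        return queue;
    }

    /** Builds a 3 x 2 x 4 test stack where each value encodes its index: 100*z + 10*y + x. */
    static double[][][] demoStack() {
        double[][][] stack = new double[3][2][4];
        for (int z = 0; z < 3; z++)
            for (int y = 0; y < 2; y++)
                for (int x = 0; x < 4; x++)
                    stack[z][y][x] = 100 * z + 10 * y + x;
        return stack;
    }

    public static void main(String[] args) throws InterruptedException {
        double[][][] stack = demoStack();
        BlockingQueue<double[][]> slices = sliceAllAsync(stack);
        for (int z = 0; z < stack.length; z++) {
            double[][] image = slices.take(); // a UI would pass this to a plot instead
            System.out.println("slice " + z + " first pixel = " + image[0][0]);
        }
    }
}
```

The bounded queue is the design point here: slicing never runs on the consuming (UI) thread, and the producer cannot run arbitrarily far ahead of what is being displayed.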
Demonstration – Slicing and dicing
Example opening a tomography file and slicing it
Passerelle Origins
• Passerelle is an open-source, Ptolemy 2 based framework produced by
Isencia Belgium.
• Passerelle has Swing, HTML5 and SWT/RCP versions today
1. Passerelle extends / customizes Ptolemy 2 for
projects in telecommunications
2. Passerelle first used at the Soleil synchrotron - in its Swing
incarnation
3. A project completed with the ESRF to convert Passerelle UI to
SWT in the RCP/Eclipse platform
4. The DAWN project incorporates the ESRF work and creates a
new custom message to pass between actors.
Common Message
• Messages passed between actors are complex
• Passerelle defines a message with a header
• DAWN sends multiple scalars and list values between actors in
one message
• This enables graphs to be simplified at the expense of
flexibility
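One way to picture such a message is a single object carrying a header plus named scalars and named lists. The sketch below is an assumption for illustration only: `WorkflowMessage`, its methods and `scaleActor` are hypothetical names and do not reproduce the Passerelle or DAWN message classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical message type: one header plus named scalars and named lists. */
class WorkflowMessage {
    private final Map<String, String> header = new LinkedHashMap<>();
    private final Map<String, Double> scalars = new LinkedHashMap<>();
    private final Map<String, List<Double>> lists = new LinkedHashMap<>();

    WorkflowMessage header(String key, String value) { header.put(key, value); return this; }
    WorkflowMessage scalar(String name, double value) { scalars.put(name, value); return this; }
    WorkflowMessage list(String name, List<Double> values) {
        // Store a read-only copy so downstream actors cannot alter upstream data.
        lists.put(name, Collections.unmodifiableList(new ArrayList<>(values)));
        return this;
    }

    String getHeader(String key) { return header.get(key); }
    double getScalar(String name) { return scalars.get(name); }
    List<Double> getList(String name) { return lists.get(name); }

    /** Copies this message so an actor can add results without mutating its input. */
    WorkflowMessage copy() {
        WorkflowMessage m = new WorkflowMessage();
        m.header.putAll(header);
        m.scalars.putAll(scalars);
        m.lists.putAll(lists);
        return m;
    }

    /** Example actor body: scales list "y" by scalar "gain" and records the sum. */
    static WorkflowMessage scaleActor(WorkflowMessage in) {
        double gain = in.getScalar("gain");
        List<Double> scaled = new ArrayList<>();
        double sum = 0;
        for (double v : in.getList("y")) {
            scaled.add(v * gain);
            sum += v * gain;
        }
        return in.copy().list("y_scaled", scaled).scalar("sum", sum);
    }

    public static void main(String[] args) {
        WorkflowMessage in = new WorkflowMessage()
                .header("source", "demo")
                .scalar("gain", 2.0)
                .list("y", Arrays.asList(1.0, 2.0, 3.0));
        WorkflowMessage out = scaleActor(in);
        System.out.println(out.getList("y_scaled") + " sum=" + out.getScalar("sum"));
    }
}
```

Because everything travels in one message, an actor needs only one input and one output port, which keeps the graph simple. The cost is that each actor must agree on the names it expects (`"gain"`, `"y"` here), which is the loss of flexibility the slide mentions.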
Demonstration – Simple Matrix Maths
Add, subtract, etc. some images produced by an experiment...
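As a rough idea of what such image arithmetic involves, here is a minimal sketch of element-wise addition and subtraction on two images of equal size. It uses plain `double[][]` arrays and hypothetical names (`ImageMaths`, `add`, `subtract`); the actual demonstration ran inside a workflow and is not reproduced here.

```java
/** Sketch: element-wise add/subtract of two equally sized images stored as [y][x]. */
class ImageMaths {

    static double[][] add(double[][] a, double[][] b) {
        return combine(a, b, 1.0);
    }

    static double[][] subtract(double[][] a, double[][] b) {
        return combine(a, b, -1.0);
    }

    /** Computes a + sign * b pixel by pixel, rejecting images whose shapes differ. */
    private static double[][] combine(double[][] a, double[][] b, double sign) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("row count differs");
        }
        double[][] out = new double[a.length][];
        for (int y = 0; y < a.length; y++) {
            if (a[y].length != b[y].length) {
                throw new IllegalArgumentException("width differs in row " + y);
            }
            out[y] = new double[a[y].length];
            for (int x = 0; x < a[y].length; x++) {
                out[y][x] = a[y][x] + sign * b[y][x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] signal = {{10, 20}, {30, 40}};
        double[][] background = {{1, 2}, {3, 4}};
        double[][] corrected = subtract(signal, background);
        System.out.println(java.util.Arrays.deepToString(corrected));
    }
}
```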
I05 ARPES Beamline
• Angle-Resolved PhotoEmission Spectroscopy
• Used to look at the electron properties of surfaces.
I05 ARPES Why Workflows?
• Easier to work with beamline scientists: it
seems less like black-box data processing.
• Rich data message makes components reusable
• Individual plugins are more testable and stable.
I05 ARPES User Interaction
• Users interact with this through a front end,
and never see the workflow behind.
Cluster Project
Non-crystalline diffraction beamlines have an
existing algorithm in Ptolemy 2 / Passerelle:
- On i7 ~120 images take 4 minutes to process
- Image stack processed in parallel using load
balancing (Fork/Join Java 7)
We would like to run this FAST
- Split stack into chunks
- Process chunks on cluster nodes
- Cluster node actor to process chunks
Use of JMS and DRMAA planned
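The in-process part of this, processing an image stack in parallel with Java 7 Fork/Join by splitting it into chunks, might look roughly like the sketch below. The names (`ChunkedStack`, `ChunkTask`, `processImage`) and the chunk size are assumptions, and `processImage` is a placeholder that just sums pixels rather than the beamline algorithm. Dispatching chunks to cluster nodes over JMS or DRMAA is not shown.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Sketch: process an image stack in parallel by recursively splitting it into chunks. */
class ChunkedStack {

    /** Largest number of images handled sequentially by one task (assumed value). */
    static final int CHUNK = 16;

    /** Placeholder for the real per-image analysis: sums all pixel values. */
    static double processImage(double[][] image) {
        double sum = 0;
        for (double[] row : image) {
            for (double v : row) {
                sum += v;
            }
        }
        return sum;
    }

    /** Processes images [from, to) of the stack, splitting until a chunk is small enough. */
    static final class ChunkTask extends RecursiveAction {
        private final double[][][] stack;
        private final double[] out;
        private final int from;
        private final int to;

        ChunkTask(double[][][] stack, double[] out, int from, int to) {
            this.stack = stack;
            this.out = out;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= CHUNK) {
                for (int i = from; i < to; i++) {
                    out[i] = processImage(stack[i]); // each index is written by one task only
                }
                return;
            }
            int mid = (from + to) >>> 1;
            // The pool's work stealing balances the two halves across worker threads.
            invokeAll(new ChunkTask(stack, out, from, mid), new ChunkTask(stack, out, mid, to));
        }
    }

    /** Returns one result per image, computed on a Fork/Join pool. */
    static double[] processStack(double[][][] stack) {
        double[] out = new double[stack.length];
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new ChunkTask(stack, out, 0, stack.length));
        } finally {
            pool.shutdown();
        }
        return out;
    }

    public static void main(String[] args) {
        double[][][] stack = new double[120][2][2];
        for (int i = 0; i < stack.length; i++) {
            for (double[] row : stack[i]) {
                java.util.Arrays.fill(row, i);
            }
        }
        double[] results = processStack(stack);
        System.out.println("images processed: " + results.length);
    }
}
```

The same chunk boundaries could in principle be reused when chunks are sent to cluster nodes instead of local threads, which is why the split is kept separate from the per-image work.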
[Workflow diagram, "Load Balance / Cluster chunks". Actors shown: File Input (Source), Load Balance (Transformer, NEW), Cluster Node (Transformer, NEW), Filter (Transformer), Fit (Transformer), Data Export (Sink). Other labels: Sub-Model, In, Out, Loop.]
Future of DAWN (wrt Ptolemy)
• New RCP workflow editor using Graphiti
– New routing options
– Improved graphical layer and tools
– eclipse.org/graphiti/
• Cluster connectivity
– Load balancing actor
– Cluster node actor based on DRMAA drmaa.org
• Increased support for data regions and functions
Ptolemy 2 (Questions about the) Future
...Brainstorming
• Usage of the Fork/Join capability in Java 7?
• How to make best use of Lambda functions in Java 8?
• How is the Kepler RCP project going (is there one)?
• Can we collaborate in the future between Kepler and DAWN or
Passerelle?
• Would you like to join the Science Working Group at Eclipse?
• There is a DAWN workshop in June 2014 to which you would be
welcome.
• Thanks to Ptolemy 2 and Passerelle for their API which
has been useful for our workflows feature.
• Thanks to Eclipse for providing a great tool
– RCP is fast and scalable too, using OSGi
– SWT can be configured to handle very large data
– Ability to integrate native code in plugins if needed
– Maybe we can support web applications with RAP one day
• Thanks to the Java community for its APIs
Conclusion
Diamond Light Source Ltd. www.diamond.ac.uk
ESRF www.esrf.fr
Data Analysis Workbench, www.dawnsci.org

DAWN and Scientific Workflows
