Making Sense of Information Through Planetary Scale Computing, Larry Smarr
09.03.01
Invited Presentation to the
Diamond Exchange—Brave New World
Title: Making Sense of Information Through Planetary Scale Computing
Monterey, CA
This talk will provide an introduction to the DReAMS research line at NECSTLab. At NECSTLab we are working on developing a Coursera specialization. The set of four courses will introduce students to FPGA technologies and to the concept of reconfigurability in FPGAs, presenting the available mechanisms and technologies at the device level as well as the tools and design methodologies required to design FPGA-based computing systems. The courses will cover the different aspects of the design of FPGA-based systems, starting from basic knowledge and moving to advanced design methodologies for implementing complex designs via SDAccel on Amazon AWS F1 instances. The talk will start by describing the work done so far and the future plans for realizing the specialization.
We will then focus on two research projects that will also be used during the online classes.
We will first present CAOS, a framework that helps the application designer identify acceleration opportunities and guides them through the implementation of the final FPGA-based system. The CAOS platform targets the full stack of the application optimization process, from the identification of the kernel functions to accelerate, through the optimization of those kernels, to the generation of the runtime management and configuration files needed to program the FPGA. After CAOS, we will present the HUGenomics project. The unique genetic profile of a species is leading to the development of customized treatments, from personalized medicine to agrigenomics, but the exponential growth of available genomic data requires a computational effort that may limit the progress of these fields. The HUGenomics framework aims at facilitating the genome assembly process by means of both hardware-accelerated algorithms and scientific data visualization tools. The system raises the level of abstraction, allowing users to easily integrate custom algorithms into the hardware pipeline without any knowledge of the underlying architecture.
Cloud Standards in the Real World: Cloud Standards Testing for Developers, Alan Sill
Learn about standards studied in the US National Science Foundation Cloud and Autonomic Computing Industry/University Cooperative Research Center Cloud Standards Testing Lab and how you can get involved to extend the successes from these results in your own cloud software settings. Presented at the O'Reilly OSCON 2014 Open Cloud Day.
Video available at https://www.youtube.com/watch?v=eD2h0SqC7tY
06.07.26
Invited Talk
Cyberinfrastructure for Humanities, Arts, and Social Sciences, A Summer Institute, SDSC
Title: The OptIPuter and Its Applications
La Jolla, CA
Future Internet: Managing Innovation and Testbed, Shinji Shimojo
Innovation is a key word for ICT research and development. However, the road toward innovation is full of uncertainties, and there are many obstacles. The key elements for overcoming these obstacles seem to be agile management of people, software, and hardware. In addition, we think the involvement of users in R&D will have a strong effect on the management of uncertainty in R&D. In this talk, I present our approach to user involvement in JGN-X, an international future-internet testbed, and Knowledge Capital, Osaka, a smart-city experimental testbed.
How Global-Scale Personal Lightwaves are Transforming Scientific Research, Larry Smarr
07.03.08
Speaker
Distinguished Lecturer Series
Department of Computer Science
Title: How Global-Scale Personal Lightwaves are Transforming Scientific Research
UC Davis
OCRE Workshop: Shaping the Earth Observation Services Market for Research. Session 3: Presentations from DIAS and eoMALL.
This workshop aims to bring the EO service providers closer to the research community, capture their needs, and develop fit-for-purpose EO services.
The event will be the 4th OCRE Requirements Gathering Workshop. Researchers and Earth Observation Service Providers will be asked to provide inputs to help us shape OCRE's tender.
The OCRE project aims to provide the first end-to-end instance of organised, large-scale market pull for EO services in Europe. These services will be provided for free to EU researchers through the European Open Science Cloud. To ensure that the services meet the actual needs of the research community, we invite both the demand and the supply side to share their views and engage in a productive dialogue. Our aim is to capture the needs of EU researchers and inform the EO service providers so that they make available services that effectively address them. We will also explain how the OCRE process will work, how the different stakeholders should be involved, and how to make the most of the foreseen benefits.
On 29 January 2020 ARCHIVER launched its Request for Tender with the purpose to award several Framework Agreements and work orders for the provision of R&D for hybrid end-to-end archival and preservation services that meet the innovation challenges of European Research communities, in the context of the European Open Science Cloud.
The tender was closed on 28 April 2020 and 15 R&D bids were submitted, with consortia that included 43 companies and organisations. The best bids have been selected and will start the first phase of the ARCHIVER R&D (Solution Design) in June 2020.
On Monday 8 June the selected consortia for the ARCHIVER design phase were announced during a Public Award Ceremony starting at 14.00 CEST.
In light of the COVID-19 outbreak and the consequent movement restrictions imposed in several countries, the event was organised as a webinar, virtually hosted by Port d’Informació Científica (PIC), a member of the Buyers Group of the ARCHIVER consortium.
The Kick-off marks the beginning of the Solution Design Phase.
Presentation slides for the SDCSB Cytoscape Workshop on 5/19/2016. The presentation covers the current status of the Cytoscape project and gives an overview of the Cytoscape ecosystem. It briefly mentions the Cytoscape Cyberinfrastructure.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Invited talk, Journées Nationales du GDR GPL 2024
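The sampling and measurement strategies mentioned in the abstract can be sketched with a toy example. Everything below (the option names, the cost model, the sample size) is invented for illustration; it only shows the shape of exploring a variability space by uniform random sampling rather than exhaustively.

```python
import itertools
import random

# Hypothetical configuration space: three binary build options.
options = {"O3": [False, True], "lto": [False, True], "debug": [False, True]}

def build_time(cfg):
    # Toy cost model: a base cost plus an additive per-option effect.
    return 10 + (5 if cfg["O3"] else 0) + (8 if cfg["lto"] else 0) + (2 if cfg["debug"] else 0)

# Exhaustive enumeration is feasible here (2^3 = 8 configurations) ...
space = [dict(zip(options, vals)) for vals in itertools.product(*options.values())]

# ... but real variability spaces are huge, so we sample uniformly at random
# and measure only the sampled configurations.
random.seed(0)
sample = random.sample(space, 4)
measured = {tuple(sorted(c.items())): build_time(c) for c in sample}
print(f"measured {len(measured)} of {len(space)} configurations")
```

In practice the measured sample would feed a performance model (e.g. via transfer learning or feature selection, as mentioned above) instead of being inspected directly.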
7. Solving molecular puzzles by computational docking
haddock.science.uu.nl
>10000 users worldwide
Used by major pharma companies
8. Haddock web portal
• > 10500 registered users
• > 188000 served runs since June 2008
• > 35% on the GRID
Visit bonvinlab.org/software
De Vries et al. Nature Prot. 2010
Van Zundert et al. J.Mol.Biol. 2016
16. # Number of dimensions 2
# INAME 1 1H
# INAME 2 1H
12 2.137 2.387 1 T 0.000e+00 0.00e+00 - 0 2756 2760 0
14 2.387 4.140 1 T 0.000e+00 0.00e+00 - 0 2760 2752 0
32 1.849 4.432 1 T 0.000e+00 0.00e+00 - 0 2259 2257 0
36 1.849 3.143 1 T 0.000e+00 0.00e+00 - 0 2259 2587 0
39 1.760 4.432 1 T 0.000e+00 0.00e+00 - 0 2260 2257 0
40 1.760 1.849 1 T 0.000e+00 0.00e+00 - 0 2260 2259 0
43 1.760 3.143 1 T 0.000e+00 0.00e+00 - 0 2260 2587 0
46 1.649 4.432 1 T 1.035e+05 0.00e+00 r 0 2583 2257 0
47 1.649 1.849 1 T 0.000e+00 0.00e+00 - 0 2583 2259 0
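The peak-list excerpt above can be read programmatically. A minimal sketch, assuming whitespace-separated columns where the first field is the peak id, the next two are the 1H chemical shifts (ppm), and the sixth is the integrated volume; the meaning of the remaining columns is not asserted here.

```python
# Two rows copied from the excerpt above, used as sample input.
lines = """\
12 2.137 2.387 1 T 0.000e+00 0.00e+00 - 0 2756 2760 0
46 1.649 4.432 1 T 1.035e+05 0.00e+00 r 0 2583 2257 0
"""

peaks = []
for line in lines.splitlines():
    cols = line.split()
    peaks.append({
        "id": int(cols[0]),
        "shifts": (float(cols[1]), float(cols[2])),  # ppm in both 1H dimensions
        "volume": float(cols[5]),                    # integrated peak volume
    })

# Only the second peak in this excerpt carries a non-zero volume.
integrated = [p for p in peaks if p["volume"] > 0]
print(integrated)
```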
assign ( resid 501 and name OO )
( resid 501 and name Z )
( resid 501 and name X )
( resid 501 and name Y )
( resid 2 and name CA ) -0.1400 0.15000
assign ( resid 501 and name OO )
( resid 501 and name Z )
( resid 501 and name X )
( resid 501 and name Y )
( resid 3 and name CA ) -0.0100 0.15000
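A small sketch of how such CNS-style `assign` blocks could be parsed, assuming the five-selection layout shown above (four axis selections on residue 501 followed by the measured atom, then a value and an error). The regex and field interpretation are illustrative, not taken from the WeNMR code.

```python
import re

# One assign block copied from the excerpt above.
text = """
assign ( resid 501 and name OO )
( resid 501 and name Z )
( resid 501 and name X )
( resid 501 and name Y )
( resid 2 and name CA ) -0.1400 0.15000
"""

pattern = re.compile(
    r"assign\s*"                                            # keyword
    r"(?:\(\s*resid\s+\d+\s+and\s+name\s+\w+\s*\)\s*){4}"   # four axis selections
    r"\(\s*resid\s+(\d+)\s+and\s+name\s+(\w+)\s*\)\s*"      # the measured atom
    r"(-?[\d.]+)\s+([\d.]+)"                                # value and error
)

for resid, name, value, error in pattern.findall(text):
    print(resid, name, float(value), float(error))
```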
Data interpretation
Structure, dynamics & interactions
→ impact on research and health:
- origin of disease
- design of new experiments
- drug design…
Exploiting GRID resources in structural biology…
Computations
NMR data collection and processing, SAXS data analysis
17. eScience hub for NMR and structural biology
Infrastructure
Science
Community
Knowledge
The WeNMR VRC
19. WeNMR VRC (February 2018)
• enmr.eu: One of the largest (#users) VO in life sciences
• >830 users have registered so far (36% outside EU)
• Support from >40 sites for >200’000 CPU cores via EGI infrastructure
• User-friendly access to Grid via web portals
• Supported by an SLA (2016, updated in 2017) with EGI and NGIs
www.wenmr.eu
NMR
SAXS
A worldwide e-Infrastructure for NMR and structural biology
22. Challenges & e-Solutions
§ Attract users!
§ Offer them top-of-the-line eScience solutions for their research ... which means top-of-the-line software
23. The WeNMR VRC
[Diagram: Knowledge (Help Center; Tutorials, Wiki; Consultancy), Services (Portals; VRC; Third-party aggregation; Grid), Exposure (Marketplace; Blogs, news, events…), User (SSO; Facebook)]
• 39 web portals (31 NMR, 7 SAXS)
• of which 29 by partners
• Uniform access through the new Single Sign On functionality
• RPC access available for some portals
25. Challenges & e-Solutions
§ Attract users!
§ Offer them top-of-the-line eScience solutions for their research ... which means top-of-the-line software
§ Provide them training, tutorials and support
26. The WeNMR VRC
• Help center
• Consultancy remote or on location
• Tutorials, wiki documents, movies
• YouTube channel
• Many workshops …
27. Challenges & e-Solutions
§ Attract users!
§ Offer them top-of-the-line eScience solutions for their research ... which means top-of-the-line software
§ Provide them training, tutorials and support
§ Make their life easier
28. The WeNMR VRC
33. Job management
§ Need to handle millions of job submissions
§ Initially based on gLite
§ Mostly migrated to DIRAC4EGI
§ From a user perspective DIRAC is in principle grid/cloud agnostic:
§ Can automatically launch VMs
§ Software distributed via CVMFS
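The grid/cloud-agnostic idea can be illustrated with a toy dispatcher: the job is described once and the system decides where it runs. This is not the DIRAC API; the function names and capacity rule are invented for illustration.

```python
# Toy backend-agnostic job submission: try backends in order and place the
# job on the first one with free capacity (mimicking grid first, cloud VMs
# as overflow).
def submit(job, backends):
    """Return the name of the backend the job was placed on."""
    for name, capacity in backends.items():
        if capacity > 0:
            backends[name] -= 1
            return name
    raise RuntimeError("no free slots on any backend")

backends = {"grid": 2, "cloud-vm": 1}
placements = [submit({"executable": "run_docking"}, backends) for _ in range(3)]
print(placements)  # the grid fills up first, then the cloud VM is used
```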
35. European Open Science Cloud
CC
Under EGI-Engage
The eInfrastructure landscape over the years
36. § With activities toward:
§ Integrating the communities
§ Making best use of cloud resources
§ Bringing data to the cloud (cryo-EM)
§ Exploiting GPGPU resources
§ While maintaining the quality of our current services!
The MoBrain CC under EGI Engage
43. Exploring GPGPU resources: PowerFit
• Python package to
automatically fit high-
resolution biomolecular
structures into cryo-EM
densities
• Simple command-line
program, able to run using
single/multiple CPUs or GPU
van Zundert and Bonvin. AIMS Biophysics 2, 73-87 (2015)
www.github.com/haddocking/powerfit
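The exhaustive fitting PowerFit performs can be illustrated in one dimension: slide a template along a density profile and keep the offset with the highest cross-correlation. The real tool works on 3-D maps with rotational sampling and FFT acceleration; the arrays below are invented toy data.

```python
# Toy 1-D density profile and a low-resolution template to place in it.
density  = [0.0, 0.1, 0.9, 1.0, 0.8, 0.1, 0.0, 0.0]
template = [0.9, 1.0, 0.8]

def cross_correlation(offset):
    # Overlap score of the template placed at the given offset.
    return sum(t * density[offset + i] for i, t in enumerate(template))

# Exhaustive translational search: score every valid placement.
best = max(range(len(density) - len(template) + 1), key=cross_correlation)
print("best offset:", best)  # the template matches the density peak at index 2
```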
44. Exploring GPGPU resources: DisVis
• Python package to visualize and quantify the accessible interaction space of distance-restrained binary biomolecular complexes
• Simple command-line program, able to run using single/multiple CPUs or GPU
van Zundert and Bonvin. Bioinformatics. 31, 3222-3224 (2015)
www.github.com/haddocking/disvis
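What DisVis quantifies can be sketched in two dimensions: count the placements of one molecule that are consistent with a distance restraint to the other. The grid, restraint bounds, and geometry here are invented for illustration; DisVis does this over rotations and translations in 3-D.

```python
# Fixed receptor atom at the origin; hypothetical restraint: the ligand
# centre must lie between 2 and 4 distance units away.
lo, hi = 2.0, 4.0

def satisfies(x, y):
    d = (x * x + y * y) ** 0.5
    return lo <= d <= hi

# Scan an 11x11 grid of candidate ligand placements.
grid = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
accessible = [p for p in grid if satisfies(*p)]
print(f"{len(accessible)} of {len(grid)} placements satisfy the restraint")
```

The ratio of accessible to total placements is the kind of quantity DisVis reports, per restraint and for all restraints combined.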
46. Baremetal vs grid vs cloud
ID     Type           GPU        #Cores           CPU type                                    Mem (GB)
B-K20  Baremetal      Tesla K20  24 HT (12 real)  Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz   32
B-K40  Baremetal      Tesla K40  48 HT (24 real)  Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz   512
D-K20  Docker on K20  Tesla K20  24               Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz   32
K-K40  KVM on K40     Tesla K40  24               Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz   32

Case   Machine    TimeGPU (sec)  TimeCPU 1 core (sec)  CPU1/GPU
B-K40  Baremetal  674            7928                  11.8
K-K40  KVM        671            7996                  11.9
B-K20  Baremetal  830            11839                 14.3
D-K20  Docker     837            11926                 14.3

No loss of performance
Courtesy of Mario David, INDIGO
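The CPU1/GPU column can be sanity-checked directly from the reported wall times; the ratios below agree with the table to within rounding.

```python
# machine: (GPU seconds, single-core CPU seconds), from the table above.
times = {
    "B-K40": (674, 7928),
    "K-K40": (671, 7996),
    "B-K20": (830, 11839),
    "D-K20": (837, 11926),
}
for machine, (gpu, cpu) in times.items():
    print(machine, round(cpu / gpu, 1))  # single-core CPU time over GPU time
```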
47. GPGPU, GRID-enabled web portals
http://milou.science.uu.nl/enmr/services/DISVIS/
http://milou.science.uu.nl/enmr/services/POWERFIT/
Architecture behind the portals
[Diagram: WEB CLIENT → WEB SERVER (user DB lookup: "user not found" / "input error"; validation; pre-processing + input files packaging) → MASTER NODE (submission to local nodes or to a grid node) → WORKING NODE (GPU calculation or CPU calculation; Chimera image generation) → post-processing + results formatting; output files packaging + submission of image generation]
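The portal flow just described can be sketched as plain functions: validate the request, pre-process, run on a local GPU node or fall back to a grid CPU node, then post-process. Names, the user list, and the fallback rule are illustrative, not the portal's actual code.

```python
# Hypothetical, simplified portal pipeline (illustration only).
KNOWN_USERS = {"alice", "bob"}  # stand-in for the user DB

def run_job(user, payload, gpu_slots):
    if user not in KNOWN_USERS:                # user DB lookup
        return "error: user not found"
    if not payload:                            # input validation
        return "error: input error"
    packaged = f"packaged({payload})"          # pre-processing + packaging
    backend = "local-gpu" if gpu_slots > 0 else "grid-cpu"
    result = f"{backend}:{packaged}"           # GPU or CPU calculation
    return f"postprocessed({result})"          # results formatting + images

print(run_job("alice", "map.mrc", gpu_slots=1))
print(run_job("eve", "map.mrc", gpu_slots=1))
```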
50. Some usage stats
Operational since Aug. 2016, published Dec. 2016
Top pulls in INDIGO applications Docker hub
https://hub.docker.com/r/indigodatacloudapps/
54. Thematic services under EOSC-Hub
https://www.egi.eu/use-cases/scientific-applications-tools/
55. Thematic services under EOSC-Hub
§ Harvest both
§ DIRAC4EGI can handle both without the additional burden of managing the cloud VMs
§ We still have much more grid than cloud resources
§ HADDOCK portal as use case in Helix Nebula Science Cloud
56. The exascale challenge
Ø ~20’000 human proteins
Ø Hundreds of thousands of interactions
Ø Billions of CPU hours and exabytes of data
Ø Need to make our software ready for it!
bioexcel.eu
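A back-of-envelope check of these numbers: docking every pair of ~20,000 human proteins. The per-pair CPU cost is an assumed illustrative figure, not a measured value.

```python
# All-against-all docking of the human proteome: how many pairs, and
# roughly how many CPU hours at an assumed cost per pair?
proteins = 20_000
pairs = proteins * (proteins - 1) // 2       # unordered protein pairs
cpu_hours_per_pair = 10                      # assumption for illustration only
total_cpu_hours = pairs * cpu_hours_per_pair
print(f"{pairs:,} pairs -> {total_cpu_hours:,} CPU hours")
```

Even at this modest assumed cost, the total lands in the billions of CPU hours, consistent with the slide's estimate.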