This document discusses the potential for an open software platform for the Square Kilometre Array (SKA) radio telescope project. It notes the "data deluge" problem and sees the SKA Science Data Processor (SDP) compute model as a general case for distributed computing. It introduces TOPS, an open source distributed operating system being developed by Open Parallel for rack-scale computing. The document advocates starting with open source and OpenStack and asks for help developing an open software stack to address exascale challenges like power consumption, heterogeneous hardware, and software-defined systems.
SKA_in_Seoul_2015_NicolasErdody v2.0
1. An Open Software Platform for the SKA?
Nicolás Erdödy
Founder, CEO – Open Parallel Ltd
SKA in Seoul: Asia-Pacific Regional Workshop in HI Science
Seoul, Korea - November 2, 2015
3. Brief
● The Problem: the "data deluge"
● The Opportunity: we see the SKA SDP compute model as the general case
● TOPS: a distributed OS for rack-scale computing
● How to start: open source & OpenStack
● We need your help...
4. Efficient recognition of signals within a massive amount of data noise improves operational efficiency and scientific discovery, and forms the cradle of adaptive service delivery.
5. As today's HPC becomes tomorrow's cloud computing platform, it will enable a wider application of Machine Understanding: the near real-time complex modelling and analysis of data that leads to insight and faster decisions.
7. Today's problems and beyond
● Non-professional software development (in many scientific environments) leads to limited or no software-stack reuse.
● Data deluge (44 zettabytes by 2020 – IDC).
● The exascale challenge: 10^18 calculations per second.
● Power consumption.
● Heterogeneous hardware.
● Compute Islands?
● Software Defined Everything (SDN, SDI, SDS).
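To give those two headline numbers a sense of scale, here is a back-of-envelope sketch (our own illustration, not from the slides) combining the 44 ZB forecast with an exaflop machine; the "10 operations per byte" workload figure is purely an assumption:

```python
# Back-of-envelope: how long would one exaflop machine need to touch
# all of the forecast 2020 data, assuming 10 operations per byte?

ZETTABYTE = 10**21          # bytes
EXAFLOP = 10**18            # calculations per second

global_data_2020 = 44 * ZETTABYTE   # IDC's 44 ZB forecast cited above
ops_per_byte = 10                   # assumed workload intensity

seconds = global_data_2020 * ops_per_byte / EXAFLOP
days = seconds / 86_400

print(f"{seconds:.2e} s ≈ {days:,.1f} days on one exaflop machine")
```

Even under these generous assumptions the answer comes out in days, which is why the slides treat exascale compute and the data deluge as a single coupled problem rather than two separate ones.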
8. SDP Preliminary Compute Platform Design (*)
● Quite different from a general-purpose supercomputer.
● Workload-driven system design philosophy to tune SDP hardware.
● SDP Compute Islands: "self-contained, independent collections of compute nodes".
● Each island only processes the data contained in the island itself.
● (*) Broekema, van Nieuwpoort, Bal (July 2015)
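The Compute Island constraint above can be sketched as code. This is a hypothetical toy model, not the actual SDP design: data is partitioned once up front, each island processes only the partitions it holds, and there is no cross-island data movement before the final merge.

```python
# Toy sketch of the "Compute Island" idea: static partitioning,
# island-local processing, merge only at the end.

from collections import defaultdict

def assign_to_islands(chunks, n_islands):
    """Statically partition data chunks across islands (round-robin)."""
    islands = defaultdict(list)
    for i, chunk in enumerate(chunks):
        islands[i % n_islands].append(chunk)
    return islands

def process_island(island_id, local_chunks):
    """Each island works only on its own data."""
    return sum(local_chunks)  # stand-in for real signal processing

chunks = list(range(12))     # pretend visibility data
islands = assign_to_islands(chunks, n_islands=3)
results = {i: process_island(i, c) for i, c in islands.items()}
print(results)  # → {0: 18, 1: 22, 2: 26}
```

The design choice this illustrates: by fixing the data-to-island mapping in advance, the expensive all-to-all interconnect of a general-purpose supercomputer can be traded away, which is one reason the slide calls the SDP design "quite different from a general-purpose supercomputer".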
9. TOPS – What we are doing
● Conceived as a rack-scale distributed operating system for the data centre.
● TOPS workshop #2 (Multicore World 2016, Wellington, NZ).
● CSP's Software Development Plan.
● Panel "Towards an Open Software Stack for Exascale Computing" at SC15 – Austin, Texas, USA (15-20 Nov).
● OpenStack – South Africa – 2015 CHPC conference, Pretoria (1-4 Dec).
10. "Towards an Open Software Stack for Exascale Computing" (SC15 – 19 Nov – Austin, USA)
● Prof. Jack Dongarra (Tennessee; Turing Fellow, Manchester; scientific advisory board for the SKA; LINPACK).
● Prof. Thomas Sterling (Indiana; Center for Research in Extreme Scale Technologies; Beowulf clusters; MCW15).
● Dr. Pete Beckman (Exascale Technology & Computing Institute, Argonne Labs – Chicago; Argo OS).
● Dr. John Gustafson (fmr AMD Chief Product Architect; Director, Intel Labs; Sun; Gustafson's Law; MCW14).
● Dr. Robert Wisniewski (Chief Software Architect, Exascale Computing, Intel; formerly Chief Software Architect, Blue Gene Supercomputer, IBM).
● Chris Broekema (SDP COMP Task Leader, ASTRON, Netherlands).
11. Your input
● What should TOPS be / do for you?
● Let's start a chat - this is a 2-5 year conversation.
● Thank you!
● OpenParallel.com
● MulticoreWorld.com
● Nicolas.Erdody@openparallel.com
● Oamaru, South Island, New Zealand
12. The data deluge will change how we build and manage new systems to store and understand data.
13. "This time, we have time"
a) How should software evolve to address exascale demands? Are OpenStack or other platforms part of the solution? Algorithms should evolve, and most legacy software will be replaced: so what should be the focus of the new ones? To save power? To increase speed? To improve programmability?
b) How heterogeneous would/should "your" exascale system be? Is there a role for co-design towards exascale?
c) The SKA project is an example where, once it becomes operational, exascale problems will appear very early. But venture capitalists don't invest in radio telescopes. What killer app would attract them towards early adoption of exascale computing? Which industries will migrate first?
d) Would HPC in the cloud be possible for exascale computing? Which technologies do we need to change / challenge to make it feasible? Data transport? Servers? Which of those technologies matter most for your work?
e) Do you envisage a development effort similar to what we had with OSS over decades, or will bottlenecks develop due to a lack of specialised talent globally? Will proprietary solutions continue to emerge or co-exist? Who will "own" the exascale era? Microsoft? Google? Will there be competition between existing companies and "not yet founded" start-ups, or will each organisation have its own in-house development shop?
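Question (d) turns largely on data-transport arithmetic, which is easy to sketch. The figures below are illustrative assumptions of ours, not SKA specifications:

```python
# Rough sketch for question (d): does data transport alone make
# cloud-hosted exascale hard? Assumed: 1 PB/day of ingest over a
# dedicated 100 Gbit/s link (both figures hypothetical).

PB = 10**15

daily_ingest = 1 * PB                 # assumed ingest volume per day
link_gbps = 100                       # assumed link capacity

bytes_per_sec = link_gbps * 1e9 / 8   # 12.5 GB/s
hours = daily_ingest / bytes_per_sec / 3600
print(f"moving 1 PB over {link_gbps} Gbit/s takes ≈ {hours:.1f} hours")
```

Under these assumptions the link spends roughly 22 of every 24 hours just keeping up with ingest, which suggests that for cloud-hosted exascale the network, not the servers, is the first technology to "change / challenge".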
14. Open Parallel Ltd.
● NZ company – involved with the SKA since 2011.
● Three NZ organisations (AUT, VUW & OP) were formally pre-selected in 2012 by the NZ Government, after international peer review, as viable prospects for engagement in the SDP and CSP.
● Since 2013 Open Parallel is formally:
- Work Package Manager of the Software Development Environment for the CSP,
- contributing to the SDP Compute Platform,
- a member of the NZ SKA Alliance (led by AUT University).
15. OP's work for the SKA
What's done (2013 - 2015):
Version 1 of the "SKA CSP Element Software Development Plan" (SE-23): how the CSP element "will develop and deliver software and/or firmware in accordance with a design specification."
Incorporated into SDP's Architecture Reference Document (2014) and referenced in SDP's "Compute Platform: Software stack developments and considerations" (SDP's PDR, 2015).
To be fully delivered over the Stage 2 timeframe (2015 - 2017).
Most recent task: provide the CSP Consortium with SW/FW process requirements to support the effective re-use of SW and FW developed during pre-construction for construction.
Note:
CSP = Central Signal Processor
SDP = Science Data Processor
16. Could SKA's IT be a Black Swan?
• "Black Swan" = a high-impact event that is rare and unpredictable but in retrospect seems not so improbable.
• One in six IT projects is a black swan, with a cost overrun of 200% on average (*).
• Developers struggle to combine different software systems.
• 61% of managers report major conflicts between project and line organisations.
• (*) "Why your IT Project may be riskier than you think". B. Flyvbjerg et al., HBR, Sept. 2011.
17. What is the SKA?
● The world's largest radio telescope
● The ultimate big data project
● The largest supercomputer in the world
● A technological management challenge
and...
● The general case of future HPC + Cloud...
18. Our world is full of data
● "Every year we collect more data than all the data collected since the beginning of mankind." (Prof. Alex Szalay, Johns Hopkins University; TEDx Caltech 2011; keynote at Multicore World 2016.)
● Exponentially faster computing + successive generations of inexpensive sensors + you on your smartphone sharing all those images.
● Data-intensive science, synthesizing theory (equations), experiments and computation with analytics → a new way of thinking is required!