Opinions on the State of
Production Distributed
Infrastructure (PDI)
Daniel S. Katz (dsk@ci.uchicago.edu)
Senior Fellow, Computation Institute (University
of Chicago & Argonne National Laboratory)

(all the following are personal opinions only)




                                                   www.ci.anl.gov
                                                   www.ci.uchicago.edu
State of the Art in PDI
•   PDI - a distributed set of computational hardware and software that is
    intended to allow multiple people who are not the developers of the
    infrastructure to do something useful
•   Three types of PDI exist:
      –   Academic/Public Production (aimed at science)
             o   TeraGrid/XSEDE, DEISA (HPC), OSG, EGI (HTC), Open Science Data Cloud (cloud), etc.
      –   Academic/Public Research (aimed at computer science)
             o Grid5000, DAS, FutureGrid, PlanetLab, etc.
             o Open question – how to transition lessons to production
      –   Commercial (aimed at whoever will pay)
             o   Amazon (IaaS), Microsoft (PaaS), perhaps SGI, Penguin (HPCaaS)
•   In addition, lots of other components that may not be integrated together:
      –   Campus clusters (HPC), campus Condor pools (HTC)
•   All seem to do what they were designed to do reasonably well
•   Note: this view is mostly by funding and purpose
•   Note: Could also view by interface:
      –   Command-line, grid (Globus, Condor, etc.), cloud
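The two notes above say the same set of PDIs can be sliced in more than one way. A minimal sketch of that idea follows. The `PDI` class, its field names, and the `group_by` helper are hypothetical names introduced for illustration only. The HPC/HTC/cloud tags follow one reading of the parentheses on the slide, and only entries that carry an explicit tag are listed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PDI:
    """One infrastructure entry (hypothetical record type, not from the slides)."""
    name: str
    category: str  # funding/purpose view, as grouped on the slide
    kind: str      # tag given in parentheses on the slide


# Entries copied from the slide; the tag assignment is an assumption
# based on how the parenthesised labels are read.
PDIS = [
    PDI("TeraGrid/XSEDE", "academic/public production", "HPC"),
    PDI("DEISA", "academic/public production", "HPC"),
    PDI("OSG", "academic/public production", "HTC"),
    PDI("EGI", "academic/public production", "HTC"),
    PDI("Open Science Data Cloud", "academic/public production", "cloud"),
    PDI("Amazon", "commercial", "IaaS"),
    PDI("Microsoft", "commercial", "PaaS"),
]


def group_by(items, field):
    """Group entry names by the value of one attribute, keeping input order."""
    groups = defaultdict(list)
    for item in items:
        groups[getattr(item, field)].append(item.name)
    return dict(groups)


if __name__ == "__main__":
    print(group_by(PDIS, "category"))  # view by funding and purpose
    print(group_by(PDIS, "kind"))      # alternative view by service type
```

Grouping by `category` reproduces the funding-and-purpose view used on this slide, while grouping by `kind` gives a different cut through the same list. A view by interface would need an extra field that the slide does not fill in per system, so it is left out here.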


2   Science Agency Use Cases (2011-07-18) dsk@ci.uchicago.edu
User View of PDI
•   Need to answer:
      –   Application developers:
             o   How do I develop an application?
                   – What is the model?
             o   How do I deploy an application?
                   – Particularly hard for HPC
      –   Application users:
             o   How do I find an application?
      –   Application builders:
             o   How do I compose an application?
                   – First need to find components...
      –   In all cases:
             o   How do I execute an application?

Open Challenges (in R&D, code, support,
and/or policy)
•   Goal – deliver maximum science (at minimum cost?)
      –   Note – sustainability seems to be a hot topic, but it
          seems to be defined as: useful work should continue to
          be done, with someone else paying for it
•   2 views of this?
      –   What are the components of infrastructure?
             o   Hardware (nodes, interconnect, storage, network), software
                 (system software, middleware, tools/libraries, applications),
                 training (material, people), support (people), integration into
                 PDIs
      –   What’s the vision of “the” PDI?
             o For those who build the infrastructure components, what is the
               architecture?
             o For applications people, what is the abstraction of the PDI?
Open Challenges (in R&D, code, support,
and/or policy) (1)
•       Issues
        –   How to measure delivered science
              o   We don’t really know how to do this – we can measure papers (w/
                  a small time delay) or citations (w/ longer time delay)
        –   How to develop the infrastructure/tools that will best do
            this
              o   I think we are doing reasonably well at this at an individual
                  infrastructure/tool level—we identify things that users want to
                  do, then provide tools/systems that let them do these things—
                  but it’s not clear that we ever solve the whole integrated problem
        –   How to integrate them
              o We can do this on a piece-by-piece basis, but don’t really know
                how to do this in general (maybe because we don’t have a single
                vision for where we are going)
              o Note that if we can do this, we can then allow the pieces to
                compete
Open Challenges (in R&D, code, support,
and/or policy) (2)
•       Issues
        –   How to represent users
              o   Large projects have the ability and opportunity to express their
                  needs; small projects and individual users are often not
                  represented
        –   Role of virtualization in HPC
              o HPC executables are not portable now, but could be with
                virtualization
              o But virtual HPC applications don’t perform well (I/O, network
                issues)
        –   How to build the application programming system that
            matches the abstract PDI?
              o   We’ve had a nice run of ~20 years with the MPI model, which
                  assumes a platform abstraction of interconnected nodes, each
                  with CPUs and memory
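The platform abstraction named in the last bullet (nodes with private memory that interact only through an interconnect) can be illustrated with a toy sketch. This is not MPI and uses no MPI library. Threads stand in for nodes and queues stand in for the interconnect, and the names `run_ranks` and `ring_sum` are hypothetical, introduced only for this example.

```python
import queue
import threading


def run_ranks(size, work):
    """Run work(rank, size, send, recv) on `size` threads, one inbox per rank."""
    inboxes = [queue.Queue() for _ in range(size)]  # stand-in for the interconnect
    results = [None] * size

    def send(dest, value):
        inboxes[dest].put(value)

    def make_recv(rank):
        # Each rank can only read from its own inbox.
        return lambda: inboxes[rank].get()

    def target(rank):
        results[rank] = work(rank, size, send, make_recv(rank))

    threads = [threading.Thread(target=target, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def ring_sum(rank, size, send, recv):
    """Pass a running total once around a ring; no rank reads another's data."""
    local = rank + 1  # the value held in this rank's private "memory"
    if rank == 0:
        send(1 % size, local)
        return recv()  # the total arrives after a full lap
    total = recv() + local
    send((rank + 1) % size, total)
    return total


if __name__ == "__main__":
    print(run_ranks(4, ring_sum))
```

Each rank sees only its own value and whatever arrives as a message, which is the assumption about the platform that the bullet describes. The open question on the slide is what the corresponding abstraction should be for a PDI, where that assumption need not hold.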

Path Forward
•   Greatest need is a single vision that defines the overall goal
      –   NSF has CIF21 (http://www.nsf.gov/pubs/2010/nsf10015/nsf10015.pdf)
          which calls for such a vision, and 6 task forces that have created reports
      –   DOE seems more focused on single large systems, and less on integrating
          them into an infrastructure
      –   Some other US agencies looking at clouds, but not distributed computing
          in general
      –   I haven’t seen a European vision that includes integration of all PDIs,
          though the UMD implies one for HPC and Grids
•   and is flexible enough to allow groups to work towards that goal in
    various ways
      –   Should define metrics (to enable progress to be measured and see which
          work is most useful)
      –   And an overall architecture with interfaces
             o   OGF is trying to do some of this, but without sufficient US buy-in
•   DARPA funded an effort to understand user productivity in HPCS, but
    no one is doing this for PDIs

Acknowledgements
•   First slide (and probably some other thinking) is
    based on chapter 3 of upcoming book:
      –   S. Jha, D. S. Katz, M. Parashar, O. Rana, and J. Weissman.
          Abstractions for Distributed Applications and Systems.
          Wiley.
•   And technical report that’s coming sooner:
      –   D. S. Katz, S. Jha, M. Parashar, O. Rana, and J. Weissman.
          Distributed Cyberinfrastructures. Technical Report.
          Computation Institute, University of Chicago & Argonne
          National Laboratory, URL/number TBD
•   And related eSI research themes (DPA and 3DPAS)
                                                                www.ci.anl.gov
8   Science Agency Use Cases (2011-07-18) dsk@ci.uchicago.edu
                                                                www.ci.uchicago.edu

More Related Content

Viewers also liked

Advancing Science through Coordinated Cyberinfrastructure
Advancing Science through Coordinated CyberinfrastructureAdvancing Science through Coordinated Cyberinfrastructure
Advancing Science through Coordinated CyberinfrastructureDaniel S. Katz
 
Massachusetts Tidelands Law
Massachusetts Tidelands LawMassachusetts Tidelands Law
Massachusetts Tidelands Lawjoecal
 
NSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meetingNSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meetingDaniel S. Katz
 
Software and Education at NSF/ACI
Software and Education at NSF/ACISoftware and Education at NSF/ACI
Software and Education at NSF/ACIDaniel S. Katz
 
Perspectives on Undergraduate Education in Parallel and Distributed Computing
Perspectives on Undergraduate Education in Parallel and Distributed ComputingPerspectives on Undergraduate Education in Parallel and Distributed Computing
Perspectives on Undergraduate Education in Parallel and Distributed ComputingDaniel S. Katz
 
20160607 citation4software panel
20160607 citation4software panel20160607 citation4software panel
20160607 citation4software panelDaniel S. Katz
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsDaniel S. Katz
 

Viewers also liked (7)

Advancing Science through Coordinated Cyberinfrastructure
Advancing Science through Coordinated CyberinfrastructureAdvancing Science through Coordinated Cyberinfrastructure
Advancing Science through Coordinated Cyberinfrastructure
 
Massachusetts Tidelands Law
Massachusetts Tidelands LawMassachusetts Tidelands Law
Massachusetts Tidelands Law
 
NSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meetingNSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meeting
 
Software and Education at NSF/ACI
Software and Education at NSF/ACISoftware and Education at NSF/ACI
Software and Education at NSF/ACI
 
Perspectives on Undergraduate Education in Parallel and Distributed Computing
Perspectives on Undergraduate Education in Parallel and Distributed ComputingPerspectives on Undergraduate Education in Parallel and Distributed Computing
Perspectives on Undergraduate Education in Parallel and Distributed Computing
 
20160607 citation4software panel
20160607 citation4software panel20160607 citation4software panel
20160607 citation4software panel
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and Metrics
 

Similar to Opinions on the State of Production Distributed Infrastructure (PDI)

Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesDaniel S. Katz
 
NSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meetingNSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meetingDaniel S. Katz
 
Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Daniel S. Katz
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainDaniel S. Katz
 
Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Daniel S. Katz
 
Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in AcademiaDaniel S. Katz
 
EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube
 
Big Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsBig Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsGeoffrey Fox
 
NSF Software @ ApacheConNA
NSF Software @ ApacheConNANSF Software @ ApacheConNA
NSF Software @ ApacheConNADaniel S. Katz
 
Pathways to Technology Transfer and Adoption: Achievements and Challenges
Pathways to Technology Transfer and Adoption: Achievements and ChallengesPathways to Technology Transfer and Adoption: Achievements and Challenges
Pathways to Technology Transfer and Adoption: Achievements and ChallengesTao Xie
 
On data-driven systems analyzing, supporting and enhancing users’ interaction...
On data-driven systems analyzing, supporting and enhancing users’ interaction...On data-driven systems analyzing, supporting and enhancing users’ interaction...
On data-driven systems analyzing, supporting and enhancing users’ interaction...Grial - University of Salamanca
 
ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)John Towns
 
A Space X Industry Day Briefing 7 Jul08 Jgm R4
A Space X Industry Day Briefing 7 Jul08 Jgm R4A Space X Industry Day Briefing 7 Jul08 Jgm R4
A Space X Industry Day Briefing 7 Jul08 Jgm R4jmorriso
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?Daniel S. Katz
 
Built to grow: scalability factors to consider before commencing your next di...
Built to grow: scalability factors to consider before commencing your next di...Built to grow: scalability factors to consider before commencing your next di...
Built to grow: scalability factors to consider before commencing your next di...Marcus Emmanuel Barnes
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationMathieu d'Aquin
 
(Big) Data (Science) Skills
(Big) Data (Science) Skills(Big) Data (Science) Skills
(Big) Data (Science) SkillsOscar Corcho
 

Similar to Opinions on the State of Production Distributed Infrastructure (PDI) (20)

Summary of 3DPAS
Summary of 3DPASSummary of 3DPAS
Summary of 3DPAS
 
Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community Responses
 
NSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meetingNSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meeting
 
Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)
 
Data-X-v3.1
Data-X-v3.1Data-X-v3.1
Data-X-v3.1
 
Data-X-Sparse-v2
Data-X-Sparse-v2Data-X-Sparse-v2
Data-X-Sparse-v2
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
 
Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)
 
Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in Academia
 
EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013
 
Big Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsBig Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other things
 
NSF Software @ ApacheConNA
NSF Software @ ApacheConNANSF Software @ ApacheConNA
NSF Software @ ApacheConNA
 
Pathways to Technology Transfer and Adoption: Achievements and Challenges
Pathways to Technology Transfer and Adoption: Achievements and ChallengesPathways to Technology Transfer and Adoption: Achievements and Challenges
Pathways to Technology Transfer and Adoption: Achievements and Challenges
 
On data-driven systems analyzing, supporting and enhancing users’ interaction...
On data-driven systems analyzing, supporting and enhancing users’ interaction...On data-driven systems analyzing, supporting and enhancing users’ interaction...
On data-driven systems analyzing, supporting and enhancing users’ interaction...
 
ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)
 
A Space X Industry Day Briefing 7 Jul08 Jgm R4
A Space X Industry Day Briefing 7 Jul08 Jgm R4A Space X Industry Day Briefing 7 Jul08 Jgm R4
A Space X Industry Day Briefing 7 Jul08 Jgm R4
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?
 
Built to grow: scalability factors to consider before commencing your next di...
Built to grow: scalability factors to consider before commencing your next di...Built to grow: scalability factors to consider before commencing your next di...
Built to grow: scalability factors to consider before commencing your next di...
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education Organisation
 
(Big) Data (Science) Skills
(Big) Data (Science) Skills(Big) Data (Science) Skills
(Big) Data (Science) Skills
 

More from Daniel S. Katz

Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSADaniel S. Katz
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonDaniel S. Katz
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Daniel S. Katz
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsDaniel S. Katz
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...Daniel S. Katz
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainabilityDaniel S. Katz
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and PracticeDaniel S. Katz
 
Research Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIResearch Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIDaniel S. Katz
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflowsDaniel S. Katz
 
Citation and reproducibility in software
Citation and reproducibility in softwareCitation and reproducibility in software
Citation and reproducibility in softwareDaniel S. Katz
 
Software Citation: Principles, Implementation, and Impact
Software Citation:  Principles, Implementation, and ImpactSoftware Citation:  Principles, Implementation, and Impact
Software Citation: Principles, Implementation, and ImpactDaniel S. Katz
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsDaniel S. Katz
 
Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...Daniel S. Katz
 
20160607 citation4software opening
20160607 citation4software opening20160607 citation4software opening
20160607 citation4software openingDaniel S. Katz
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?Daniel S. Katz
 
Looking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFLooking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFDaniel S. Katz
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsDaniel S. Katz
 

More from Daniel S. Katz (20)

Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSA
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research Objects
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainability
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and Practice
 
URSSI
URSSIURSSI
URSSI
 
Research Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIResearch Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSI
 
Software citation
Software citationSoftware citation
Software citation
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflows
 
Citation and reproducibility in software
Citation and reproducibility in softwareCitation and reproducibility in software
Citation and reproducibility in software
 
Software Citation: Principles, Implementation, and Impact
Software Citation:  Principles, Implementation, and ImpactSoftware Citation:  Principles, Implementation, and Impact
Software Citation: Principles, Implementation, and Impact
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groups
 
Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...
 
20160607 citation4software opening
20160607 citation4software opening20160607 citation4software opening
20160607 citation4software opening
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?
 
Looking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFLooking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSF
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative results
 

Recently uploaded

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Opinions on the State of Production Distributed Infrastructure (PDI)

  • 1. Opinions on the State of Production Distributed Infrastructure (PDI)
      Daniel S. Katz (dsk@ci.uchicago.edu)
      Senior Fellow, Computation Institute (University of Chicago & Argonne National Laboratory)
      (all the following are personal opinions only)
      www.ci.anl.gov
      www.ci.uchicago.edu
  • 2. State of the Art in PDI
      • PDI - a distributed set of computational hardware and software that is intended to allow multiple people who are not the developers of the infrastructure to do something useful
      • Three types of PDI exist:
          – Academic/Public Production (aimed at science)
              o TeraGrid/XSEDE, DEISA (HPC), OSG, EGI (HTC), Open Science Data Cloud (cloud), etc.
          – Academic/Public Research (aimed at computer science)
              o Grid5000, DAS, FutureGrid, PlanetLab, etc.
              o Open question – how to transition lessons to production
          – Commercial (aimed at whoever will pay)
              o Amazon (IaaS), Microsoft (PaaS), perhaps SGI, Penguin (HPCaaS)
      • In addition, lots of other components that may not be integrated together:
          – Campus clusters (HPC), campus Condor pools (HTC)
      • All seem to do what they were designed to do reasonably well
      • Note: this view is mostly by funding and purpose
      • Note: could also view by interface:
          – Command-line, grid (Globus, Condor, etc.), cloud
      Science Agency Use Cases (2011-07-18) dsk@ci.uchicago.edu
      www.ci.anl.gov
      www.ci.uchicago.edu
  • 3. User View of PDI
      • Need to answer:
          – Application developers:
              o How do I develop an application?
                  – What is the model?
              o How do I deploy an application?
                  – Particularly hard for HPC
          – Application users:
              o How do I find an application?
          – Application builders:
              o How do I compose an application?
                  – First need to find components...
          – In all cases:
              o How do I execute an application?
  • 4. Open Challenges (in R&D, code, support, and/or policy)
      • Goal – deliver maximum science (at minimum cost?)
          – Note – sustainability seems to be a hot topic, but it seems to be defined as: useful work should continue to be done, with someone else paying for it
      • 2 views of this?
          – What are the components of infrastructure?
              o Hardware (nodes, interconnect, storage, network), software (system software, middleware, tools/libraries, applications), training (material, people), support (people), integration into PDIs
          – What’s the vision of “the” PDI?
              o For those who build the infrastructure components, what is the architecture?
              o For applications people, what is the abstraction of the PDI?
  • 5. Open Challenges (in R&D, code, support, and/or policy) (1)
      • Issues
          – How to measure delivered science
              o We don’t really know how to do this – we can measure papers (w/ a small time delay) or citations (w/ longer time delay)
          – How to develop the infrastructure/tools that will best do this
              o I think we are doing reasonably well at this at an individual infrastructure/tool level (we identify things that users want to do, then provide tools/systems that let them do these things), but it’s not clear that we ever solve the whole integrated problem
          – How to integrate them
              o We can do this on a piece-by-piece basis, but don’t really know how to do this in general (maybe because we don’t have a single vision for where we are going)
              o Note that if we can do this, we can then allow the pieces to compete
  • 6. Open Challenges (in R&D, code, support, and/or policy) (2)
      • Issues
          – How to represent users
              o Large projects have the ability and opportunity to express their needs; small projects and individual users are often not represented
          – Role of virtualization in HPC
              o HPC executables are not portable now, but could be with virtualization
              o But virtual HPC applications don’t perform well (I/O, network issues)
          – How to build the application programming system that matches the abstract PDI?
              o We’ve had a nice run of ~20 years with the MPI model, which assumes a platform abstraction of interconnected nodes, each with CPUs and memory
  • 7. Path Forward
      • Greatest need is a single vision that defines the overall goal
          – NSF has CIF21 (http://www.nsf.gov/pubs/2010/nsf10015/nsf10015.pdf), which calls for such a vision, and 6 task forces that have created reports
          – DOE seems more focused on single large systems, and less on integrating them into an infrastructure
          – Some other US agencies are looking at clouds, but not distributed computing in general
          – I haven’t seen a European vision that includes integration of all PDIs, though the UMD implies one for HPC and Grids
      • and is flexible enough to allow groups to work towards that goal in various ways
          – Should define metrics (to enable progress to be measured and see which work is most useful)
          – And an overall architecture with interfaces
              o OGF is trying to do some of this, but without sufficient US buy-in
      • DARPA funded effort in understanding user productivity in HPCS – but no one is doing this for PDIs
  • 8. Acknowledgements
      • First slide (and probably some other thinking) is based on chapter 3 of upcoming book:
          – S. Jha, D. S. Katz, M. Parashar, O. Rana, and J. Weissman. Abstractions for Distributed Applications and Systems. Wiley.
      • And technical report that’s coming sooner:
          – D. S. Katz, S. Jha, M. Parashar, O. Rana, and J. Weissman. Distributed Cyberinfrastructures. Technical Report. Computation Institute, University of Chicago & Argonne National Laboratory, URL/number TBD
      • And related eSI research themes (DPA and 3DPAS)