Gateway Optimal Resource Selection and Integrated Information Services Framework
Xuan Wu, Deivasigamani Suresh Kumar, Raminder Singh, Suresh Marru, Marlon Pierce
Pervasive Technology Institute, Indiana University


Goals

Various TeraGrid information and monitoring services provide valuable information, but they are scattered among TeraGrid Information Services (TGIS); the TeraGrid news maintenance feed; INCA resource monitoring; Karnak system status, start-time and wait-time prediction; and Speedpage file transfer monitoring and estimates. TGIS provides a one-stop shop for these services, but the data remains discrete.

Gateways would like to get the information they need with a single query. Motivated by the needs of the LEAD, UltraScan and GridChem gateways, we developed the Optimal Resource Prediction Service (ORPS). The goals of ORPS are:
• Integrate all information sources into a single yes-or-no answer to the question "Is the resource healthy?" The check should verify that the machine is not in maintenance and that at least one file transfer service and a job management service are up and functional.
• Predict which resources will be optimal for running the next compute job, based on Karnak start-time prediction, pre-determined application performance and estimated run time.

Architecture

[Figure: ORPS architecture diagram]

Usage Scenario

One-Time Registration:
• Step A: Register the community and research applications used by the gateway, along with any available performance data.
• Step B: Register all gateway resources and the grid middleware they use, such as GRAM5 and GridFTP.

Example Queries:
1. Given a job configuration (number of processors, wall time), return all healthy resources sorted by start time, earliest first.
2. Gateway-Specific Resource Summary: list the status of all resources used by a gateway (UltraScan, GridChem), reporting:
   • Status of Job Management, File Transfer, GSISSH and login nodes.
   • Overall health: GOOD if all three services above are healthy; BAD if any one of them is down; UNKNOWN if any test result is unknown; or SCHEDULED MAINTENANCE.
   • Current load: number of waiting/running jobs and usage.

Hosted Service or Download & Deploy

Hosted Service Example:
http://ogceportal.iu.teragrid.org:19444/orps-service/XML/gateway/$(gatewayId)

Download, build and deploy:
1. svn co https://ogce.svn.sourceforge.net/svnroot/ogce/incubator/ORPS
2. mvn clean install
3. Configure data collection schedules, database for caching, ports
4. start.sh
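
The hosted-service URL, the overall-health rule and example query 1 can be illustrated with a short client-side sketch. It is written in Python for brevity (ORPS itself is a Java service) and is not the ORPS implementation. The class ResourceStatus, its field names, the "UP"/"DOWN"/"UNKNOWN" values and the three function names are hypothetical, since the poster does not define a schema. Two points are assumptions where the poster is silent: maintenance is checked first, and BAD takes precedence over UNKNOWN when both apply. The start-time estimate is assumed to have been computed already for the job configuration in question.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import quote

# Base URL copied from the poster; the service may no longer be reachable.
HOSTED_BASE = "http://ogceportal.iu.teragrid.org:19444/orps-service/XML/gateway/"


def gateway_summary_url(gateway_id: str) -> str:
    """Fill the $(gatewayId) placeholder of the hosted-service URL."""
    return HOSTED_BASE + quote(gateway_id, safe="")


@dataclass
class ResourceStatus:
    # Hypothetical record; field names are not taken from ORPS.
    name: str
    job_management: str   # "UP", "DOWN" or "UNKNOWN"
    file_transfer: str    # "UP", "DOWN" or "UNKNOWN"
    login: str            # "UP", "DOWN" or "UNKNOWN"
    in_maintenance: bool = False
    predicted_start_s: Optional[float] = None  # start-time estimate in seconds


def overall_health(r: ResourceStatus) -> str:
    """Collapse the per-service results into one overall state."""
    if r.in_maintenance:            # assumption: maintenance is checked first
        return "SCHEDULED MAINTENANCE"
    services = (r.job_management, r.file_transfer, r.login)
    if "DOWN" in services:          # assumption: BAD outranks UNKNOWN
        return "BAD"
    if "UNKNOWN" in services:
        return "UNKNOWN"
    return "GOOD"


def healthy_by_start_time(resources: List[ResourceStatus]) -> List[ResourceStatus]:
    """Example query 1: healthy resources, earliest predicted start first."""
    healthy = [
        r for r in resources
        if overall_health(r) == "GOOD" and r.predicted_start_s is not None
    ]
    return sorted(healthy, key=lambda r: r.predicted_start_s)
```

A gateway could call gateway_summary_url("UltraScan") to obtain the address to query, parse the response into records of this kind, and pass them to healthy_by_start_time to pick a submission target. Resources without a start-time estimate are dropped from the ranking rather than guessed at.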




Salient Features

• ORPS has a flexible and extensible architecture developed in Java on the Spring MVC framework. The framework adapts to emerging information sources.
• External information services send information through subscriptions (push) or are polled periodically.
• The scheduler polls each source according to its update frequency, and data is provided downstream in near real time.
• ORPS exposes both raw and mashed-up information to gateways through REST interfaces.
• Information sources update a schedule-aware, multi-level database cache that serves surges of job submission requests from gateways.
• The determined health and schedule are cached at the second level to ensure a quick response time (< 100 ms).
• Detailed test failures are provided to help distinguish transient from persistent failures.

Status & Future Work

Phase I (completed):
• ORPS is currently integrated into the UltraScan production gateway.
• Working with GridChem gateway developers to integrate ORPS into their development environment.

Phase II (in development):
• Application-specific scheduling: get all healthy resources to run Gaussian on TeraGrid. Selection is based on queue wait time + Gaussian relative performance data + bandwidth estimates.

Acknowledgement

The authors would like to thank the INCA, TGIS and Karnak teams for valuable discussions and support, and the UltraScan and GridChem gateways for requirements and integration.

This work is partially supported by the TeraGrid Gateway Advanced Support Activity and the Open Gateway Computing Environments NSF SDCI Grant No. OCI-1032742.
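
The idea of polling each source at its own update frequency and serving gateways from a cache, listed under Salient Features, can be sketched as follows. This is an illustration in Python, not the ORPS code: the class ScheduleAwareCache and its methods are hypothetical, it keeps a single in-memory level instead of a multi-level database cache, and it refreshes lazily on read where ORPS is described as using a scheduler and push subscriptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ScheduleAwareCache:
    """Keep one cached value per information source and refetch it only
    after that source's own update interval has elapsed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # source name -> (fetch function, refresh interval in seconds)
        self._sources: Dict[str, Tuple[Callable[[], Any], float]] = {}
        # source name -> (time of last fetch, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def register(self, name: str, fetch: Callable[[], Any], interval_s: float) -> None:
        """Declare a source together with how often it is worth polling."""
        self._sources[name] = (fetch, interval_s)

    def get(self, name: str) -> Any:
        """Return the cached value, polling the source only if it is stale."""
        fetch, interval_s = self._sources[name]
        now = self._clock()
        entry = self._entries.get(name)
        if entry is None or now - entry[0] >= interval_s:
            entry = (now, fetch())
            self._entries[name] = entry
        return entry[1]
```

A burst of gateway requests then costs at most one poll per source per interval. The clock is injectable so that the refresh behaviour can be tested without waiting.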
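
The Phase II selection criterion names three inputs (queue wait time, relative application performance and bandwidth estimates) without saying how they are combined. One plausible reading, shown here purely as an assumption, is to add them up as an estimated turnaround time. The Candidate fields, the function names and the formula itself are hypothetical and are not taken from ORPS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    # Hypothetical per-resource inputs for one application.
    name: str
    queue_wait_s: float            # predicted queue wait in seconds
    relative_performance: float    # speed relative to a reference machine (1.0 = equal)
    bandwidth_bytes_per_s: float   # estimated transfer bandwidth to the resource


def estimated_turnaround_s(c: Candidate, reference_runtime_s: float,
                           input_bytes: float) -> float:
    """Assumed model: queue wait + scaled run time + input transfer time."""
    run_s = reference_runtime_s / c.relative_performance
    transfer_s = input_bytes / c.bandwidth_bytes_per_s
    return c.queue_wait_s + run_s + transfer_s


def rank(candidates: List[Candidate], reference_runtime_s: float,
         input_bytes: float) -> List[Candidate]:
    """Order candidates by estimated turnaround, shortest first."""
    return sorted(
        candidates,
        key=lambda c: estimated_turnaround_s(c, reference_runtime_s, input_bytes),
    )
```

Under this model a resource with a longer queue can still rank first when the application runs much faster on it, which is the kind of trade-off application-specific scheduling is meant to capture.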
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 

TG11 ORPS Poster

  • 1. Gateway Optimal Resource Selection and Integrated Information Services Framework
    Xuan Wu, Deivasigamani Suresh Kumar, Raminder Singh, Suresh Marru, Marlon Pierce
    Pervasive Technology Institute, Indiana University

    Goals
    Various TeraGrid information and monitoring services provide valuable information but are scattered among TeraGrid Information Services (TGIS), the TeraGrid news maintenance feed, INCA resource monitoring, Karnak system status, start time and wait time prediction, and Speedpage file transfer monitoring and estimates. The TGIS provides a one-stop shop for these services, but the data is discrete.
    Gateways would like to get the information they need with a single query. Motivated by the needs of the LEAD, UltraScan and GridChem gateways, we developed the Optimal Resource Prediction Service (ORPS). The goals of ORPS are:
    • Integrate all information sources into a single yes-or-no answer to the question "Is the resource healthy?" The check should verify that the machine is not in maintenance and that at least one file transfer service and one job management service are up and functional.
    • Predict which resources will be optimal for running the next compute job, based on Karnak start time prediction, pre-determined application performance and estimated run time.

    Architecture
    [Architecture diagram]

    Salient Features
    • ORPS is a flexible and extensible architecture developed in Java on the Spring MVC framework. The framework adapts to emerging information sources.
    • External information services send information through subscriptions (push) or are queried by periodic polls.
    • The scheduler polls different sources based on their update frequency, and data is provided downstream in near real time.
    • ORPS exposes the raw and mashed-up information to gateways through REST interfaces.
    • A multi-level database cache, aware of each information source's update schedule, serves surges of job submission requests from gateways.
    • The determined health and schedule are cached in the second level to keep response times under 100 ms.
    • Detailed test failures are provided to help distinguish transient from persistent failures.

    Usage Scenario
    One-Time Registration:
    • Step A: Register the community and research applications used by the gateway, along with any available performance data.
    • Step B: Register all gateway resources and the grid middleware they use, such as GRAM5 and GridFTP.
    Example Queries:
    1. Given a job configuration (number of processors, wall time), return all healthy resources sorted by earliest start time.
    2. Gateway-Specific Resource Summary: list the status of all resources used by a gateway (UltraScan, GridChem), including the status of:
       • Job Management, File Transfer, GSISSH and login nodes.
       • Overall health: GOOD if all three of the above services are healthy; BAD if any one of them is down; UNKNOWN if any test result returns unknown; SCHEDULED MAINTENANCE.
       • Current load: number of waiting/running jobs and usage.

    Hosted Service or Download & Deploy
    Hosted Service Example: http://ogceportal.iu.teragrid.org:19444/orps-service/XML/gateway/$(gatewayId)
    Download, build and deploy:
    1. svn co https://ogce.svn.sourceforge.net/svnroot/ogce/incubator/ORPS
    2. mvn clean install
    3. Configure data collection schedules, database for caching, ports
    4. start.sh

    Status & Future Work
    Phase I (completed):
    • ORPS is currently integrated into the UltraScan production gateway.
    • Working with GridChem gateway developers to integrate ORPS into their development environment.
    Phase II (in development):
    • Application-specific scheduling: get all healthy resources to run Gaussian on TeraGrid. Selection based on: queue wait time + Gaussian relative performance data + bandwidth estimates.

    Acknowledgement
    The authors would like to thank the INCA, TGIS and Karnak teams for valuable discussions and support, and the UltraScan and GridChem gateways for requirements and integration. This work is partially supported by the TeraGrid Gateway Advanced Support Activity and Open Gateway Computing Environments NSF SDCI Grant No: OCI-1032742.
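The overall health rule from the resource summary query (GOOD, BAD, UNKNOWN, scheduled maintenance) can be illustrated with a minimal Python sketch. This is not the ORPS implementation, which the poster describes as Java on Spring MVC. The names `ServiceState`, `Health` and `overall_health` are hypothetical, and the precedence used here (maintenance first, then BAD, then UNKNOWN) is an assumption, since the poster does not state how conflicting results are resolved.

```python
from enum import Enum


class ServiceState(Enum):
    """Result of testing one service on a resource (hypothetical type)."""
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


class Health(Enum):
    """Overall health values named on the poster."""
    GOOD = "GOOD"
    BAD = "BAD"
    UNKNOWN = "UNKNOWN"
    SCHEDULED_MAINTENANCE = "SCHEDULED MAINTENANCE"


def overall_health(job_management, file_transfer, login, in_maintenance=False):
    """Combine three per-service test results into one overall status.

    Assumed precedence: scheduled maintenance overrides everything,
    a service that is down outranks an unknown result, and GOOD
    requires all three services to be up.
    """
    if in_maintenance:
        return Health.SCHEDULED_MAINTENANCE
    states = (job_management, file_transfer, login)
    if ServiceState.DOWN in states:
        return Health.BAD
    if ServiceState.UNKNOWN in states:
        return Health.UNKNOWN
    return Health.GOOD
```

Checking for a down service before an unknown one means a single confirmed failure is enough to exclude a resource, even when other tests have not reported yet.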

Editor's Notes

  1. The framework provides an application registry for recording the resources and applications used by a gateway. Application performance models can be plugged in to update performance data for a specific host. Once registered, the gateway can query for real-time status information, and the framework reports a status determined by checking that the required File Transfer and Job Management interfaces are healthy. As a first pass, resources in maintenance, resources with faulty job managers and resources with overwhelmed GridFTP servers are eliminated from scheduling. Combining this with Karnak job queue information and Speedpage file transfer information further increases gateway job success rates and improves turnaround times.
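The two-stage selection described in this note (eliminate unhealthy resources, then use queue and transfer estimates) might look like the following Python sketch. It is only an illustration under stated assumptions: the `Candidate` fields, the `rank_resources` function and the resource names are hypothetical, and adding the three estimates into a single turnaround figure is one possible reading of "queue wait time + relative performance data + bandwidth estimates", not a formula given by the authors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One registered resource with its current estimates (hypothetical)."""
    name: str
    healthy: bool            # outcome of the health check
    queue_wait_s: float      # predicted queue wait, e.g. from Karnak
    relative_runtime: float  # application runtime relative to a reference host
    transfer_s: float        # estimated staging time from bandwidth estimates


def rank_resources(candidates, reference_runtime_s):
    """Drop unhealthy resources, then sort by estimated turnaround.

    Assumption: turnaround is the sum of queue wait, scaled run time
    and file transfer time, all in seconds.
    """
    usable = [c for c in candidates if c.healthy]

    def turnaround(c):
        return c.queue_wait_s + reference_runtime_s * c.relative_runtime + c.transfer_s

    return sorted(usable, key=turnaround)


if __name__ == "__main__":
    pool = [
        Candidate("resource-a", True, 600.0, 1.0, 60.0),
        Candidate("resource-b", True, 60.0, 2.0, 30.0),
        Candidate("resource-c", False, 0.0, 0.5, 10.0),  # unhealthy, excluded
    ]
    for c in rank_resources(pool, reference_runtime_s=1000.0):
        print(c.name)
```

In this example the resource with the shortest queue is not ranked first, because its slower relative performance outweighs the shorter wait.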