EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu
Data Processing and Analysis
EUDAT WP5 Service Building
Tom Kirkham
STFC
DATA PROCESSING AND ANALYSIS
- GEF
- Big Data Tools
- B2NOTE
- Data Distribution
B2STAGE provides API services to manage data transfers between
B2SAFE, B2HANDLE and B2ACCESS (eudat.eu/b2stage).
The service allows users to:
- Transfer large data collections from EUDAT storage facilities to
external HPC facilities for processing
- In conjunction with B2SAFE, replicate community data sets,
ingesting them onto EUDAT storage resources for long-term
preservation
- Ingest computation results into the EUDAT infrastructure
EUDAT 6M EC Review, 28th October 2015, Brussels
OVERVIEW
• Access layer to the B2SAFE and B2FIND
services, allowing users to store, preserve and
find data
• Enables upload and download transfers of
data objects to create collections
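As a rough illustration of what upload and download transfers over an HTTP API might look like from a client, the sketch below only builds the requests and never sends them. The base URL, the path layout and the bearer-token header are assumptions made for illustration; the slides do not specify the actual B2STAGE endpoints.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint: the slides do not give the real B2STAGE base URL.
BASE_URL = "https://b2stage.example.org/api"


def object_url(collection: str, name: str) -> str:
    """Compose the URL of one data object inside a collection (assumed layout)."""
    return f"{BASE_URL}/{quote(collection.strip('/'))}/{quote(name)}"


def build_upload(collection: str, name: str, payload: bytes, token: str) -> Request:
    """Build, without sending, a PUT request that would upload a data object."""
    return Request(
        object_url(collection, name),
        data=payload,
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def build_download(collection: str, name: str, token: str) -> Request:
    """Build, without sending, a GET request that would download a data object."""
    return Request(
        object_url(collection, name),
        method="GET",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


upload = build_upload("climate/run1", "output.nc", b"example bytes", "TOKEN")
download = build_download("climate/run1", "output.nc", "TOKEN")
print(upload.get_method(), upload.full_url)
print(download.get_method(), download.full_url)
```

Passing either request to `urllib.request.urlopen` would perform the transfer against a real endpoint; here the objects are only inspected.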
BACKGROUND
[Diagram: data transfers via FTP or HTTP API]
PROGRESS
Achievements
- Integration between B2HANDLE, B2ACCESS and B2SAFE
- Enablement of data movement into the CDI
- HTTP API as a method for common access (developed and released)
- Integration with the Data Discovery Service and support for
standards such as PID
- Integration of community repositories with B2SAFE via the
HTTP API (work done by Charles University)
- Proof of concept of the HTTP API on plain filesystems, for
workspaces
Future Status
- Development continues
- Application into specific tools and filestores
THE GENERIC EXECUTION FRAMEWORK (GEF)
Goal: Enable execution of containerised software within the
CDI, thus reducing data transfer and increasing
customisation for user communities.
Technology objectives
- Utilise EUDAT services such as B2SHARE and B2DROP,
with B2SAFE planned
- Support a GEF rules engine (e.g. Drools)
- Integrate services from user communities into the CDI
GEF services/Docker containers
The GEF relies on so-called GEF services that are
customized by the user to perform the required tasks:
- GEF services are Docker images that are specifically
annotated in order to allow handling by the GEF.
- GEF service instances are Docker containers that are
spun up for execution close to the data.
User communities are solely responsible for the contents
of their images. During the pilot phase, communities will
receive support for creating their own images, but in the
long run scientists will have to become proficient at it.
A GEF INSTANCE
The container/GEF service invocations on the hosts are
controlled by a Docker Machine integrated with a GEF
instance.
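To make the idea of "specifically annotated" images concrete, the sketch below models an image as a tag plus a set of labels and picks out those that carry GEF annotations. The label prefix `eudat.gef.service.` and the `Image` class are assumptions made for illustration; the slides do not state the actual annotation scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed label prefix; the slides only say images are "specifically annotated".
GEF_PREFIX = "eudat.gef.service."


@dataclass
class Image:
    """Minimal stand-in for a Docker image: a tag plus its label annotations."""
    tag: str
    labels: Dict[str, str] = field(default_factory=dict)


def gef_metadata(image: Image) -> Dict[str, str]:
    """Return the GEF annotations of an image with the prefix stripped."""
    return {
        key[len(GEF_PREFIX):]: value
        for key, value in image.labels.items()
        if key.startswith(GEF_PREFIX)
    }


def gef_services(images: List[Image]) -> List[Image]:
    """Keep only images that carry a GEF service name annotation."""
    return [img for img in images if "name" in gef_metadata(img)]


images = [
    Image("community/word-count:1.0", {
        GEF_PREFIX + "name": "word-count",
        GEF_PREFIX + "input": "/data/in",
    }),
    Image("library/python:3", {"maintainer": "someone"}),
]

services = gef_services(images)
print([img.tag for img in services])
print(gef_metadata(services[0]))
```

Only the first image is treated as a GEF service here, because it alone carries the assumed annotations; an unannotated image is ignored rather than rejected.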
Achievements
- Generic Execution Framework (GEF) first
release in September
- Integrated services from the Earth System Grid
Federation (ESGF) and European Grid
Infrastructure (EGI) e-infrastructures
Future Work
- Integration into other communities such as
the IS-ENES Climate4Impact platform
FEATURES
• Creates RDF triples
• Harvests information from ontology repositories
• Supports semi-automatic annotation using text mining
• Supports manual data annotation
• Easy-to-use user interface
• Writes data to the triple store
• Integrates with the different EUDAT B2 services
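As a small illustration of what creating RDF triples for an annotation can involve, the sketch below expresses one annotation as three subject-predicate-object triples and serialises them as N-Triples. Using the W3C Web Annotation vocabulary (`oa:hasTarget`, `oa:hasBody`) is an assumption, as are the example IRIs; the slides do not say which vocabulary B2NOTE uses.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OA = "http://www.w3.org/ns/oa#"  # W3C Web Annotation vocabulary (assumed choice)


def annotation_triples(annotation: str, target: str, concept: str) -> List[Triple]:
    """Describe one annotation linking a data object (target) to a concept (body)."""
    return [
        (annotation, RDF_TYPE, OA + "Annotation"),
        (annotation, OA + "hasTarget", target),
        (annotation, OA + "hasBody", concept),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# All IRIs below are hypothetical examples.
triples = annotation_triples(
    "https://example.org/annotations/1",
    "https://example.org/records/42/file.csv",
    "https://example.org/ontology/SeaSurfaceTemperature",
)
print(to_ntriples(triples))
```

The concept IRI stands in for a term harvested from an ontology repository; writing to a triple store would mean sending these statements to the store rather than printing them.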
Achievements
- B2NOTE module created to support creation of
annotations
- Standards based and integrated with B2SHARE
- B2ACCESS integration gives users federated
access to resources
- Software released in January, with over 100
active users
Future Work
- Integration into communities such as OpenAIRE
- Future development in the EOSC project
- Easy integration into community services and
within OpenAIRE and EOSC-hub services
BIG DATA ANALYSIS
Goal: To open up data deposited in the EUDAT CDI to
‘Big Data’ processing
Objectives:
- Integrate a ‘Big Data’ stack into the CDI
- Handle data from EUDAT components
- Enable ‘Big Data’ analysis in user communities
Achievements
- Apache Spark and Hadoop enabled in EUDAT
- Data subscription service created to link analysis
results with user communities
- Integrated within the EUROARGO use case
Future Work
- Further development and integration of the data
subscription service into other projects such as
EOSC
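The idea of a data subscription service that links analysis results with user communities can be pictured as a publish/subscribe registry. The sketch below is a toy, in-memory version; the class name, event fields and dataset identifiers are all hypothetical and are not taken from the EUDAT implementation.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class SubscriptionService:
    """Toy in-memory publish/subscribe registry keyed by dataset identifier."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, dataset: str, handler: Handler) -> None:
        """Register a callback to be invoked for events on one dataset."""
        self._handlers[dataset].append(handler)

    def publish(self, dataset: str, event: Event) -> int:
        """Deliver an event to every subscriber; return how many were notified."""
        handlers = self._handlers.get(dataset, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


received: List[Event] = []
service = SubscriptionService()
service.subscribe("argo-floats", received.append)

notified = service.publish(
    "argo-floats", {"type": "analysis-complete", "result": "results/run-7"}
)
ignored = service.publish("other-dataset", {"type": "analysis-complete"})
print(notified, ignored, received)
```

A community subscribes once and is then notified whenever a new analysis result is published for its dataset; events for datasets without subscribers are simply dropped in this sketch.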
DATA DISTRIBUTION SERVICE
Data distribution, in terms of discovery, transfer and
integration, has been a core focus in this cluster:
- Federated integration of data
- Data annotation layer aiding discovery
- Integration with services via a common API
- Event-based subscription of data
Beyond EUDAT, this technology is reaching out into other
projects, raising the possibility of a wider view on Data
Distribution as a Service.
SOME INITIAL THOUGHTS …
SUMMARY
Software released:
- B2STAGE HTTP API
- B2NOTE
- Generic Execution Framework
- Data Subscription Service
Community use to go beyond the project:
- Projects actively working on the software beyond the
project, e.g. EOSC-hub, SeaDataCloud
Questions
EUDAT Final Review, 21st May 2015, Brussels

Unleash Your Potential - Namagunga Girls Coding Club
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

Data Processing and Analysis

  • 1. Data Processing and Analysis. EUDAT WP5 Service Building. Tom Kirkham, STFC. EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065. www.eudat.eu
  • 2. DATA PROCESSING AND ANALYSIS: GEF; Big Data Tools; B2NOTE; Data Distribution
  • 3. OVERVIEW. B2STAGE provides API services to manage data transfers between B2SAFE, B2HANDLE and B2ACCESS. The service allows users to: transfer large data collections from EUDAT storage facilities to external HPC facilities for processing; in conjunction with B2SAFE, replicate community data sets, ingesting them onto EUDAT storage resources for long-term preservation; ingest computation results into the EUDAT infrastructure. eudat.eu/b2stage. EUDAT 6M EC Review, 28th October 2015, Brussels
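As a rough illustration of the "ingest computation results" use case on slide 3, the sketch below builds (without sending) an HTTP PUT request of the kind an HTTP API client might issue. The host name, the `/api/registered` prefix, the bearer-token header and the function name are all assumptions made for illustration; none of them is taken from the slides, and the real B2STAGE endpoints may differ.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values: neither the host nor the endpoint prefix comes from the slides.
BASE_URL = "https://b2stage.example.org"
ENDPOINT_PREFIX = "/api/registered"


def build_ingest_request(remote_path: str, payload: bytes, token: str) -> Request:
    """Build, but do not send, a PUT request that would upload `payload`."""
    # Percent-encode the path so spaces and similar characters are URL-safe.
    url = BASE_URL + ENDPOINT_PREFIX + quote(remote_path)
    return Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
            "Content-Type": "application/octet-stream",
        },
    )


request = build_ingest_request("/community/results/run 01.nc", b"example bytes", "TOKEN")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example runnable without network access; an actual client would pass the request to `urllib.request.urlopen` against a real deployment.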
  • 4. BACKGROUND. Access layer to the B2SAFE and B2FIND services, allowing users to store, preserve and find data. Enables upload and download transfers of data objects to create collections.
  • 6. PROGRESS. Achievements: integration between B2Handle, B2Access and B2Safe; enablement of data movement into the CDI; HTTP API as a method for common access; developed and released integration with the Data Discovery Service and standards support such as PID; integration from community repositories with B2SAFE via the HTTP API (work done by Charles University); proof-of-concept of the HTTP API on plain filesystems, for workspaces. Future status: development continues; application into specific tools and filestores.
  • 7. THE GENERIC EXECUTION FRAMEWORK (GEF). Goal: enable execution of containerised software within the CDI, thus reducing data transfer and increasing customisation for user communities. Technology objectives: utilise EUDAT services such as B2Share, B2Drop and B2Safe (planned); support a GEF rules engine (i.e. Drools); integrate services from user communities into the CDI.
  • 8. GEF services/Docker containers. The GEF relies on so-called GEF services that are customized by the user to perform the required tasks. GEF services are Docker images that are specifically annotated to allow handling by the GEF. GEF service instances are Docker containers that are spun up for execution close to the data. User communities are solely responsible for the contents of their images. During the pilot phase, communities will receive support for creating their own images, but in the long run scientists will have to become proficient at it themselves.
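Slide 8 says GEF services are Docker images that are "specifically annotated". One way to picture this is a set of image labels that a framework inspects before accepting an image as a service. The sketch below is only that picture: the label prefix `eudat.gef.service.` and the three required fields are hypothetical and are not the GEF's actual annotation scheme.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical label namespace; the slides only say images are "specifically annotated".
LABEL_PREFIX = "eudat.gef.service."
REQUIRED_FIELDS = ("name", "input_path", "output_path")


@dataclass(frozen=True)
class GefService:
    name: str
    input_path: str
    output_path: str


def parse_gef_service(image_labels: Dict[str, str]) -> Optional[GefService]:
    """Return a GefService if the image carries every required label, else None."""
    # Keep only labels in the assumed namespace and strip the prefix.
    fields = {
        key[len(LABEL_PREFIX):]: value
        for key, value in image_labels.items()
        if key.startswith(LABEL_PREFIX)
    }
    if not all(field in fields for field in REQUIRED_FIELDS):
        return None
    return GefService(*(fields[field] for field in REQUIRED_FIELDS))


labels = {
    "maintainer": "community@example.org",
    "eudat.gef.service.name": "word-count",
    "eudat.gef.service.input_path": "/data/in",
    "eudat.gef.service.output_path": "/data/out",
}
service = parse_gef_service(labels)
print(service)
```

An image without the assumed labels is simply not treated as a service, which mirrors the slide's point that communities remain responsible for preparing their own images.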
  • 9. A GEF INSTANCE. The container/GEF service invocations on the hosts are controlled by a Docker Machine integrated with a GEF instance.
  • 10. THE GENERIC EXECUTION FRAMEWORK (GEF). Achievements: Generic Execution Framework (GEF) first release in September; integrated services from the Earth System Grid Federation (ESGF) and European Grid Infrastructure (EGI) e-infrastructures. Future work: integration into other communities such as the IS-ENES Climate4Impact platform.
  • 11. FEATURES. Creation of RDF triples. Harvests information from ontology repositories. Supports semi-automatic annotation using text mining. Supports manual data annotation. Easy-to-use user interface. Writes data to the triple store. Integrates with the different EUDAT B2 services.
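To make "creation of RDF triples" on slide 11 concrete, the sketch below expresses one annotation as three subject-predicate-object triples and serialises them as N-Triples. It assumes the W3C Web Annotation vocabulary (`oa:Annotation`, `oa:hasTarget`, `oa:hasBody`), which the slides do not name, and all `example.org` IRIs are placeholders.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Assumed vocabulary: W3C Web Annotation terms; the slides do not say which one is used.
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(annotation_iri: str, target_iri: str, concept_iri: str) -> List[Triple]:
    """Describe one annotation linking a data object to an ontology concept."""
    return [
        (annotation_iri, RDF_TYPE, OA + "Annotation"),
        (annotation_iri, OA + "hasTarget", target_iri),  # the annotated data object
        (annotation_iri, OA + "hasBody", concept_iri),   # the ontology term applied
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = annotation_triples(
    "https://example.org/annotations/1",
    "https://example.org/records/42/file.csv",
    "https://example.org/ontology/SeaSurfaceTemperature",
)
ntriples = to_ntriples(triples)
print(ntriples)
```

The resulting text is the kind of payload that could be written to a triple store; how the service actually stores annotations is not described in the slides.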
  • 12. Achievements: B2Note module created to support creation of annotations; standards-based and integrated with B2Share; B2Access integration gives users federated access to resources; software released in January, with over 100 active users. Future work: integration into communities such as OpenAIRE; future development in the EOSC project; easy integration into community services and within OpenAIRE and EOSC-hub services.
  • 13. BIG DATA ANALYSIS. Goal: to open up data deposited in the EUDAT CDI to ‘Big Data’ processing. Objectives: integrate a ‘Big Data’ stack into the CDI; handle data from EUDAT components; enable ‘Big Data’ analysis in user communities.
  • 15. BIG DATA ANALYSIS. Achievements: Apache Spark and Hadoop enabled in EUDAT; data subscription service created to link analysis results with user communities; integrated within the EUROARGO use case. Future work: further development and integration of the data subscription service into other projects such as EOSC.
  • 16. DATA DISTRIBUTION SERVICE. Data distribution, in terms of discovery, transfer and integration, has been a core focus in this cluster: federated integration of data; a data annotation layer aiding discovery; integration with services via a common API; event-based subscription to data. Beyond EUDAT, this technology is reaching out into other projects, raising the possibility of a wider view of Data Distribution as a Service.
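The "event-based subscription" idea on slide 16 can be pictured as a small publish/subscribe registry: communities register interest in a dataset and are notified when something new is published for it. The class, method names and dataset identifiers below are hypothetical and say nothing about how the EUDAT data subscription service is actually built.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class SubscriptionRegistry:
    """In-memory publish/subscribe registry keyed by dataset identifier."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, dataset: str, handler: Handler) -> None:
        """Register a callback to run whenever `dataset` publishes an event."""
        self._handlers[dataset].append(handler)

    def publish(self, dataset: str, event: str) -> int:
        """Deliver `event` to every subscriber of `dataset`; return how many were notified."""
        handlers = self._handlers.get(dataset, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


received: List[str] = []
registry = SubscriptionRegistry()
registry.subscribe("euro-argo/floats", received.append)  # hypothetical dataset id

notified = registry.publish("euro-argo/floats", "new analysis result available")
ignored = registry.publish("other/dataset", "update")
print(notified, ignored, received)
```

A production service would deliver events across the network and persist subscriptions; the in-memory version only shows the subscribe-then-notify flow.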
  • 18. SUMMARY. Software released: B2STAGE HTTP API, B2NOTE, Generic Execution Framework, Data Subscription Service. Community use is set to continue beyond the project. Other projects are actively working on the software beyond the project, e.g. EOSC-hub and SeaDataCloud.
  • 19. Questions EUDAT Final Review, 21st May 2015, Brussels