Lydia Heck, Campus network engineering workshop
19/10/2016
Archiving data from Durham to RAL using the File Transfer Service (FTS)
19 October 2016 Campus Network Engineering for Data
Intensive Science Workshop
Lydia Heck
Institute for Computational Cosmology
Manager of the DiRAC-2/2.5 Data Centric Facility
COSMA
Introduction to DiRAC
• DiRAC (Distributed Research utilising Advanced Computing) was established in 2009 with DiRAC-1
  • Supports research in theoretical astronomy, particle physics and nuclear physics
  • Funded by STFC, with infrastructure money allocated from the Department for Business, Innovation and Skills (BIS)
  • The running costs, such as staff costs and electricity, are funded by STFC
• DiRAC is classed by STFC as a major research facility, on a par with the big telescopes
What is DiRAC
• A national service run, managed and allocated by the scientists who do the science, funded by BIS and STFC
• The systems are built around and for the applications with which the science is done.
• We do not rival a facility like ARCHER, as we do not aspire to run a general national service.
What is DiRAC? (cont’d)
• For the highlights of science carried out on the DiRAC facility, please see: http://www.dirac.ac.uk/science.html
• Specific example: large-scale structure calculations with the Eagle run
  • 4096 cores
  • ~8 GB RAM/core
  • 47 days = 4,620,288 CPU hours
  • 200 TB of data
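The CPU-hour figure for the Eagle run follows directly from the core count and the run length. The short check below reproduces it, assuming all 4096 cores were in use for the full 47 days and taking the approximate 8 GB RAM/core at face value.

```python
# Reproduce the Eagle run figures quoted on the slide.
cores = 4096
days = 47
ram_per_core_gb = 8  # the slide says "~8 GB", so this is approximate

cpu_hours = cores * days * 24
ram_gb = cores * ram_per_core_gb

print(f"{cpu_hours:,} CPU hours")          # 4,620,288 CPU hours
print(f"~{ram_gb:,} GB of RAM in total")   # ~32,768 GB of RAM in total
```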
The DiRAC computing systems
• Blue Gene: Edinburgh
• Cosmos: Cambridge
• Complexity: Leicester
• Data Centric: Durham
• Data Analytic: Cambridge
COSMA @ DiRAC (Data Centric)
Durham – Data Centric system – IBM iDataPlex
• 6720 Intel Sandy Bridge cores
• 53.8 TB of RAM
• FDR10 InfiniBand, 2:1 blocking
• 2.5 PB of GPFS storage (2.2 PB used!)
Resources of DiRAC
• Long projects are allocated a significant number of CPU hours, typically for 3 years, on one or more of the 5 available systems. Resources available:
System         Cores                           CPU hours  Storage                                        Location
Bluegene       98,304 cores                    861 M      1 PB (GPFS)                                    Edinburgh
Data Centric   6720 Xeon cores                 59 M       2.5 PB (GPFS)                                  Durham (DiRAC2)
Data Centric   8000 Xeon cores                 > 71 M     2.5 PB data (Lustre), 1.8 PB scratch (Lustre)  Durham (DiRAC2.5)
Complexity     4352 Xeon cores                 38 M       0.8 PB (Panasas)                               Leicester
Data Analytic  4800 Xeon cores                 42 M       0.75 PB (Lustre)                               Cambridge
SMP            1784 Xeon cores, shared memory  15.6 M     146 TB (EXT)                                   Cambridge
Why do we need to copy data?
• During a project, and when it is completed, copy data to the researchers’ home institutions
  • requires additional storage resources at the home institutions
  • not enough provision – will require additional funds
• Make backup copies
  • if disaster struck, many CPU hours of calculations would be lost
• Copy data to other sites to leverage compute resources for post-processing
• Storage on the HPC facility runs out of capacity
  • data creation considerably above expectation?
Why do we copy data to RAL?
• Research data must now be available to interested parties for a specified period of time
• We could install DiRAC's own archive
  • requires funds, and there is (currently) no budget
• We needed to get started:
  • to gain experience
  • to get a valid backup
  • to remove data as the resources run out
  • to identify bottlenecks and technical challenges
• Jeremy Yates (Director of DiRAC) negotiated access to the RAL archiving systems
• Set up collaborations, make use of previous experience and pool resources
• AND: copy data!
Network connectivity of Durham University
• 2012 – upgrade to 4x1 Gbit to Janet
• Janet advised investigating optimal utilisation of the available bandwidth before applying for a further upgrade
• 2014 – upgrade to 6 Gbit to Janet
• currently: 8 Gbit to Janet; should be a full 10 Gbit by the end of the year – technical issues
Network bandwidth – situation for Durham
• 2014: Measured throughput?
2014: Measured limits?
September 2014 – Measured limits
Making optimal use of available bandwidth
• Planning and investment to bypass the external campus firewall:
  • Preparatory work started in October/November 2014: two new routers (~£80k), configured for throughput with minimal ACLs, enough to safeguard the site
  • deploying internal firewalls – part of the new security infrastructure anyhow, but essential for such a venture
  • security now relies on the front-end systems of Durham DiRAC and Durham GridPP
• IPPP was moved outside the firewall in April 2015 with a clear mandate to manage security for their installation.
• The DiRAC data transfer system was moved outside about 1 month later.
GridPP Site FW config for endpoint node
• GridFTP: Port blocking
• GridFTP: Pass thru
• GridFTP
• GridFTP: Monitor w/fw
• GridFTP: Bypass site fw
Result for DiRAC and GridPP in Durham
• Guaranteed 3 Gbit/sec in/out
• Consequences:
  • pushed the network performance of Durham GridPP from the bottom 3 in the country to the top 5 of the UK GridPP sites
  • they now experience different bottlenecks, but these are under their control
  • DiRAC data transfers achieve up to 300–400 MByte/sec throughput to RAL when archiving, depending on file sizes
  • faster data sharing with other collaboration sites
  • recently (October 2016) offered a service to Earth Sciences with 70–80 MByte/sec from a site in Switzerland
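To put these rates in context, the sketch below converts the guaranteed 3 Gbit/sec into MByte/sec and estimates how long a data set the size of the Eagle run (200 TB, quoted earlier) would take at the observed 300–400 MByte/sec. It assumes decimal units (1 TB = 10^6 MByte) and a sustained rate with no pauses or retries, so the results are lower bounds rather than measured transfer times.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert a line rate in Gbit/s to MByte/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


def transfer_days(terabytes, mbyte_per_s):
    """Days needed to move `terabytes` at a sustained `mbyte_per_s`."""
    seconds = terabytes * 1e6 / mbyte_per_s
    return seconds / 86400


print(f"3 Gbit/s = {gbit_to_mbyte_per_s(3):.0f} MByte/s")
for rate in (300, 400):
    print(f"200 TB at {rate} MByte/s: {transfer_days(200, rate):.1f} days")
```

The guaranteed 3 Gbit/sec corresponds to 375 MByte/sec, which sits inside the 300–400 MByte/sec range reported for archiving to RAL.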
Collaboration between DiRAC and GridPP/RAL
• The Durham Institute for Computational Cosmology (ICC) volunteered to be the prototype installation
• Huge thanks to Jens Jensen and Brian Davies: many emails were exchanged, many questions asked and many answers given.
• Resulting document: “Setting up a system for data archiving using FTS3” by Lydia Heck, Jens Jensen and Brian Davies
  • https://www.cosma.dur.ac.uk/documentation
Setting up the archiving tools
• Identify appropriate hardware – could mean extra expense:
  • need freedom to modify and experiment with it
    • cannot have HPC users logged in and working when you need to reboot the system!
• Free to apply the very latest security updates
  • this might not always be possible on an HPC system
• Requires optimal connection to storage
  • for the transfer system this meant an InfiniBand card
Setting up the archiving tools
• Create an interface to access the file/archiving service at RAL using the GridPP tools
  • GridFTP – Globus Toolkit – also provides Globus Connect
  • Trust anchors (egi-trustanchors)
  • voms tools (emi3-xxx)
  • fts3 (CERN)
Chose to use FTS3 with GridFTP
• User submits transfer lists (and credentials) to FTS3
• Source: GPFS, data.cosma.dur.ac.uk (GridFTP)
• Destination: CASTOR-GEN, srm-dirac.gridpp.rl.ac.uk (SRM)
• Data moves via GridFTP, managed by FTS3
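As an illustration of what a transfer list might contain, the sketch below builds one source/destination URL pair per line for a set of archive files. The two host names come from the slide; the directory paths and file names are hypothetical placeholders, and the one-pair-per-line layout is an assumption about the bulk format the FTS3 client accepts, to be checked against the client documentation.

```python
SOURCE_HOST = "data.cosma.dur.ac.uk"        # GridFTP endpoint named on the slide
DEST_HOST = "srm-dirac.gridpp.rl.ac.uk"     # SRM endpoint named on the slide
SOURCE_BASE = "/hypothetical/archive/staging"      # assumption: not from the talk
DEST_BASE = "/hypothetical/castor/dirac/durham"    # assumption: not from the talk


def transfer_pairs(archive_names):
    """Return (source_url, destination_url) tuples for each archive file name."""
    pairs = []
    for name in archive_names:
        source = f"gsiftp://{SOURCE_HOST}{SOURCE_BASE}/{name}"
        dest = f"srm://{DEST_HOST}{DEST_BASE}/{name}"
        pairs.append((source, dest))
    return pairs


def format_transfer_list(pairs):
    """Render the pairs as text, one 'source destination' pair per line."""
    return "".join(f"{source} {dest}\n" for source, dest in pairs)


print(format_transfer_list(transfer_pairs(["archive_0000.tar", "archive_0001.tar"])), end="")
```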
Learning to use certificates and proxies
• Long-lived voms proxy?
  • myproxy-init; myproxy-logon; voms-proxy-init; fts-transfer-delegation
• How to create a proxy and delegation that last weeks or even months?
• This is still an issue for a voms proxy, but we circumvented it using a normal proxy:
  • grid-proxy-init; fts-transfer-delegation
  • grid-proxy-init -valid HH:MM
  • fts-transfer-delegation -e time-in-seconds
  • creates a proxy that lasts up to the certificate lifetime
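The two commands take the lifetime in different units: hours and minutes for the proxy, seconds for the delegation. A small helper such as the following sketch keeps the two values consistent. It only builds the command strings using the options shown on the slide and does not run anything; a real invocation may need further options that the slide does not show, and the 30-day lifetime is just an example value.

```python
def proxy_lifetime(days):
    """Return the lifetime as ('HH:MM' string, seconds) for a whole number of days."""
    hours = days * 24
    return f"{hours}:00", hours * 3600


def proxy_commands(days):
    """Build the two command lines from the slide for a lifetime of `days` days."""
    valid, seconds = proxy_lifetime(days)
    return [
        f"grid-proxy-init -valid {valid}",
        f"fts-transfer-delegation -e {seconds}",
    ]


for command in proxy_commands(30):  # 30 days is an illustrative choice
    print(command)
```

As the slide notes, the effective lifetime is still capped by the lifetime of the underlying certificate.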
Experiences
1. Large files – optimal throughput limited by network bandwidth
2. Many small files – limited by latency
3. Many parallel sessions – impede the proper functioning of the archive server
4. Ownership and creation dates not preserved – one grid owner
5. The simple approach of “just” pushing files will not work!
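The difference between points 1 and 2 can be illustrated with a very simple cost model: every file pays a fixed per-file cost (latency, session setup) and then its bytes move at the available bandwidth. The sketch below is only such a model, not a measurement; the 0.5 s per-file overhead is a hypothetical value chosen for illustration, and 350 MByte/sec is taken from the middle of the 300–400 MByte/sec range quoted for transfers to RAL.

```python
def transfer_time_s(n_files, total_bytes, bandwidth_bytes_per_s, per_file_overhead_s):
    """Estimated wall time: a fixed cost per file plus the time to move the bytes."""
    return n_files * per_file_overhead_s + total_bytes / bandwidth_bytes_per_s


TOTAL_BYTES = 10**12        # 1 TB of data in both scenarios
BANDWIDTH = 350 * 10**6     # 350 MByte/s sustained
OVERHEAD_S = 0.5            # hypothetical per-file cost, not a measured value

few_large = transfer_time_s(4, TOTAL_BYTES, BANDWIDTH, OVERHEAD_S)
many_small = transfer_time_s(10**6, TOTAL_BYTES, BANDWIDTH, OVERHEAD_S)

print(f"4 large files:         {few_large / 3600:.1f} h")
print(f"1,000,000 small files: {many_small / 3600:.1f} h")
```

With these assumed numbers the per-file term dominates completely for the small-file case, while for a handful of large files the total is set almost entirely by bandwidth.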
Actions to overcome issues
• tar files up in chunks of ~256 GByte
  • exclude checked-out versioning subdirectories
  • preserves ownership and time stamps in the tar archive
• keep a record of archived files
• Files to transfer are now large – limited by bandwidth, not by latency
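A minimal sketch of this chunking approach is shown below, using Python's standard tarfile module. It is not the script used at Durham: the excluded directory names, archive naming scheme and manifest format are all assumptions made for illustration. Files are grouped greedily until a chunk would exceed the size limit, so a single file larger than the limit ends up in a chunk of its own. The output directory should sit outside the tree being archived.

```python
import os
import tarfile
from pathlib import Path

EXCLUDED_DIRS = {".git", ".svn", ".hg", "CVS"}  # assumption: typical versioning directories
CHUNK_BYTES = 256 * 10**9                        # ~256 GByte per tar file


def list_files(root):
    """Yield (path, size) for regular files under root, skipping versioning directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.is_file() and not path.is_symlink():
                yield path, path.stat().st_size


def plan_chunks(files, chunk_bytes=CHUNK_BYTES):
    """Group files greedily so each chunk stays at or below chunk_bytes where possible."""
    chunks, current, current_size = [], [], 0
    for path, size in files:
        if current and current_size + size > chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def write_archives(root, out_dir, chunk_bytes=CHUNK_BYTES):
    """Write one tar file per chunk plus a manifest recording which file went where."""
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = out_dir / "manifest.tsv"
    with manifest.open("w") as record:
        for index, chunk in enumerate(plan_chunks(list_files(root), chunk_bytes)):
            archive = out_dir / f"archive_{index:04d}.tar"
            with tarfile.open(archive, "w") as tar:
                for path in chunk:
                    rel = path.relative_to(root)
                    # tar headers store owner ids and modification time per member
                    tar.add(path, arcname=str(rel))
                    record.write(f"{archive.name}\t{rel}\n")
    return manifest
```

The manifest is what later allows a single file to be located without unpacking every archive.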
Open issues
• Depends on a single admin to carry out; not automatic.
• What happens when the content of directories changes? Complete new archive sessions?
• Create a tool more like rsync – requires extensive scripting
• When retrieving data, all of a subset has to be fetched to find a single file or a string of files
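One possible starting point for an rsync-like tool is to compare a stored snapshot of a directory tree with its current state, so that only new or modified files go into the next archive session. The sketch below illustrates that idea only; it is not an existing DiRAC tool, and using size plus modification time as the change indicator is an assumption (a checksum would be stricter but slower).

```python
import os
from pathlib import Path


def snapshot(root):
    """Map each file path relative to root to its (size, modification time in ns)."""
    root = Path(root)
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()
            state[str(path.relative_to(root))] = (info.st_size, info.st_mtime_ns)
    return state


def changes_since(previous, current):
    """Return sorted lists of (new, modified, deleted) paths between two snapshots."""
    new = sorted(p for p in current if p not in previous)
    modified = sorted(p for p in current if p in previous and current[p] != previous[p])
    deleted = sorted(p for p in previous if p not in current)
    return new, modified, deleted
```

A snapshot taken at archive time could be stored alongside the record of archived files and compared with a fresh one before the next session.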
Conclusions
• With the right network speed, tools and connectivity, we can archive the DiRAC data to RAL or anywhere else.
• Documenting the procedure is very important to transfer the knowledge and avoid duplicating effort. The documentation is online: https://www.cosma.dur.ac.uk/documentation
• Each DiRAC site should have its own dirac0X account
• Start with and keep on archiving – this is more difficult, as it is not completely automatic yet and more development is required.
• Collaboration between DiRAC and GridPP/RAL DOES work!
• The work has been of benefit to other transfer actions, which significantly helps research and reflects well on the service we can deliver.
• Can we aspire to more?

More Related Content

What's hot

Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
Pascal-Nicolas Becker
 
EDINA National Datacentre Activity Update to GWG
EDINA National Datacentre Activity Update to GWGEDINA National Datacentre Activity Update to GWG
EDINA National Datacentre Activity Update to GWG
EDINA, University of Edinburgh
 
AddressingHistory: Lessons and Messages
AddressingHistory:  Lessons and MessagesAddressingHistory:  Lessons and Messages
AddressingHistory: Lessons and Messages
EDINA, University of Edinburgh
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
ariadnenetwork
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
Enno Meijers
 
Multimedia-2016_Brochure
Multimedia-2016_BrochureMultimedia-2016_Brochure
Multimedia-2016_Brochure
Gracy Jones
 
Geoservices Activities at EDINA
Geoservices Activities at EDINAGeoservices Activities at EDINA
Geoservices Activities at EDINA
EDINA, University of Edinburgh
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data network
Jisc RDM
 
PHIDIAS - Boosting the use of cloud services for marine data management, serv...
PHIDIAS - Boosting the use of cloud services for marine data management, serv...PHIDIAS - Boosting the use of cloud services for marine data management, serv...
PHIDIAS - Boosting the use of cloud services for marine data management, serv...
Phidias
 
Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized Publishing
Tobias Kuhn
 
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
ESCAPE EU
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
Miha Ahronovitz
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
Tal Lavian Ph.D.
 
The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?
Carole Goble
 
Ariadne: Interoperability
Ariadne: InteroperabilityAriadne: Interoperability
Ariadne: Interoperability
ariadnenetwork
 
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
ariadnenetwork
 
Open Access Repository Junction
Open Access Repository JunctionOpen Access Repository Junction
Open Access Repository Junction
EDINA, University of Edinburgh
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage Information
Enno Meijers
 
Challenges and Issues of Next Cloud Computing Platforms
Challenges and Issues of Next Cloud Computing PlatformsChallenges and Issues of Next Cloud Computing Platforms
Challenges and Issues of Next Cloud Computing Platforms
Frederic Desprez
 
Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP Project
Helix Nebula The Science Cloud
 

What's hot (20)

Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
 
EDINA National Datacentre Activity Update to GWG
EDINA National Datacentre Activity Update to GWGEDINA National Datacentre Activity Update to GWG
EDINA National Datacentre Activity Update to GWG
 
AddressingHistory: Lessons and Messages
AddressingHistory:  Lessons and MessagesAddressingHistory:  Lessons and Messages
AddressingHistory: Lessons and Messages
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
Multimedia-2016_Brochure
Multimedia-2016_BrochureMultimedia-2016_Brochure
Multimedia-2016_Brochure
 
Geoservices Activities at EDINA
Geoservices Activities at EDINAGeoservices Activities at EDINA
Geoservices Activities at EDINA
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data network
 
PHIDIAS - Boosting the use of cloud services for marine data management, serv...
PHIDIAS - Boosting the use of cloud services for marine data management, serv...PHIDIAS - Boosting the use of cloud services for marine data management, serv...
PHIDIAS - Boosting the use of cloud services for marine data management, serv...
 
Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized Publishing
 
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
ESCAPE Kick-off meeting - HL-LHC ESFRI Landmark (Feb 2019)
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
 
The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?
 
Ariadne: Interoperability
Ariadne: InteroperabilityAriadne: Interoperability
Ariadne: Interoperability
 
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
A First Attempt at Describing, Disseminating and Reusing Methodological Knowl...
 
Open Access Repository Junction
Open Access Repository JunctionOpen Access Repository Junction
Open Access Repository Junction
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage Information
 
Challenges and Issues of Next Cloud Computing Platforms
Challenges and Issues of Next Cloud Computing PlatformsChallenges and Issues of Next Cloud Computing Platforms
Challenges and Issues of Next Cloud Computing Platforms
 
Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP Project
 

Viewers also liked

110G networking within JASMIN
110G networking within JASMIN110G networking within JASMIN
110G networking within JASMIN
Jisc
 
Challenges in end-to-end performance
Challenges in end-to-end performanceChallenges in end-to-end performance
Challenges in end-to-end performance
Jisc
 
Provisioning Janet
Provisioning JanetProvisioning Janet
Provisioning Janet
Jisc
 
Science DMZ at Imperial
Science DMZ at ImperialScience DMZ at Imperial
Science DMZ at Imperial
Jisc
 
Solving Network Throughput Problems at the Diamond Light Source
Solving Network Throughput Problems at the Diamond Light SourceSolving Network Throughput Problems at the Diamond Light Source
Solving Network Throughput Problems at the Diamond Light Source
Jisc
 
Science DMZ security
Science DMZ securityScience DMZ security
Science DMZ security
Jisc
 
The Science DMZ
The Science DMZThe Science DMZ
The Science DMZ
Jisc
 
Electron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBICElectron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBIC
Jisc
 
Protecting our customers - BT security
Protecting our customers - BT securityProtecting our customers - BT security
Protecting our customers - BT security
Jisc
 
Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...
Jisc
 
Cyber Crime - "Who, What and How"
Cyber Crime - "Who, What and How"Cyber Crime - "Who, What and How"
Cyber Crime - "Who, What and How"
Jisc
 
Role of the CISO in Higher Education
Role of the CISO in Higher EducationRole of the CISO in Higher Education
Role of the CISO in Higher Education
Jisc
 
Mitigation starts now
Mitigation starts nowMitigation starts now
Mitigation starts now
Jisc
 
Certifying and Securing a Trusted Environment for Health Informatics Research...
Certifying and Securing a Trusted Environment for Health Informatics Research...Certifying and Securing a Trusted Environment for Health Informatics Research...
Certifying and Securing a Trusted Environment for Health Informatics Research...
Jisc
 
Working with students and ISO27001
Working with students and ISO27001Working with students and ISO27001
Working with students and ISO27001
Jisc
 
Closing plenary and keynote from Lauren Sager Weinstein
Closing plenary and keynote from Lauren Sager WeinsteinClosing plenary and keynote from Lauren Sager Weinstein
Closing plenary and keynote from Lauren Sager Weinstein
Jisc
 

Viewers also liked (16)

110G networking within JASMIN
110G networking within JASMIN110G networking within JASMIN
110G networking within JASMIN
 
Challenges in end-to-end performance
Challenges in end-to-end performanceChallenges in end-to-end performance
Challenges in end-to-end performance
 
Provisioning Janet
Provisioning JanetProvisioning Janet
Provisioning Janet
 
Science DMZ at Imperial
Science DMZ at ImperialScience DMZ at Imperial
Science DMZ at Imperial
 
Solving Network Throughput Problems at the Diamond Light Source
Solving Network Throughput Problems at the Diamond Light SourceSolving Network Throughput Problems at the Diamond Light Source
Solving Network Throughput Problems at the Diamond Light Source
 
Science DMZ security
Science DMZ securityScience DMZ security
Science DMZ security
 
The Science DMZ
The Science DMZThe Science DMZ
The Science DMZ
 
Electron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBICElectron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBIC
 
Protecting our customers - BT security
Protecting our customers - BT securityProtecting our customers - BT security
Protecting our customers - BT security
 
Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...
 
Cyber Crime - "Who, What and How"
Cyber Crime - "Who, What and How"Cyber Crime - "Who, What and How"
Cyber Crime - "Who, What and How"
 
Role of the CISO in Higher Education
Role of the CISO in Higher EducationRole of the CISO in Higher Education
Role of the CISO in Higher Education
 
Mitigation starts now
Mitigation starts nowMitigation starts now
Mitigation starts now
 
Certifying and Securing a Trusted Environment for Health Informatics Research...
Certifying and Securing a Trusted Environment for Health Informatics Research...Certifying and Securing a Trusted Environment for Health Informatics Research...
Certifying and Securing a Trusted Environment for Health Informatics Research...
 
Working with students and ISO27001
Working with students and ISO27001Working with students and ISO27001
Working with students and ISO27001
 
Closing plenary and keynote from Lauren Sager Weinstein
Closing plenary and keynote from Lauren Sager WeinsteinClosing plenary and keynote from Lauren Sager Weinstein
Closing plenary and keynote from Lauren Sager Weinstein
 

Similar to Archiving data from Durham to RAL using the File Transfer Service (FTS)

Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
Larry Smarr
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
inside-BigData.com
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data Sharing
Globus
 
Campus networking
Campus networkingCampus networking
Campus networking
Jisc
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
 
Science DMZ as a Service: Creating Science Super- Facilities with GENI
Science DMZ as a Service: Creating Science Super- Facilities with GENIScience DMZ as a Service: Creating Science Super- Facilities with GENI
Science DMZ as a Service: Creating Science Super- Facilities with GENI
US-Ignite
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research Platform
Larry Smarr
 
040419 san forum
040419 san forum040419 san forum
040419 san forum
Thiru Raja
 
The Pacific Research Platform: Building a Distributed Big Data Machine Learni...
The Pacific Research Platform: Building a Distributed Big Data Machine Learni...The Pacific Research Platform: Building a Distributed Big Data Machine Learni...
The Pacific Research Platform: Building a Distributed Big Data Machine Learni...
Larry Smarr
 
Data Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityData Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and Flexibility
APNIC
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learning
DataWorks Summit
 
Future services on Janet
Future services on JanetFuture services on Janet
Future services on Janet
Jisc
 
The SKA Project - The World's Largest Streaming Data Processor
The SKA Project - The World's Largest Streaming Data ProcessorThe SKA Project - The World's Largest Streaming Data Processor
The SKA Project - The World's Largest Streaming Data Processor
inside-BigData.com
 
Pacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big DataPacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big Data
Larry Smarr
 
Low cost robotic tape library systems Using Open source Technology
Low cost robotic tape library systems Using Open source TechnologyLow cost robotic tape library systems Using Open source Technology
Low cost robotic tape library systems Using Open source Technology
Africa Open Science & Hardware
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and Ceremony
Archiver
 
Network research
Network researchNetwork research
Network research
Jisc
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
Larry Smarr
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
Larry Smarr
 

Similar to Archiving data from Durham to RAL using the File Transfer Service (FTS) (20)

Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data Sharing
 
Campus networking
Campus networkingCampus networking
Campus networking
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
Science DMZ as a Service: Creating Science Super- Facilities with GENI
Science DMZ as a Service: Creating Science Super- Facilities with GENIScience DMZ as a Service: Creating Science Super- Facilities with GENI
Science DMZ as a Service: Creating Science Super- Facilities with GENI
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research Platform
 
040419 san forum
040419 san forum040419 san forum
040419 san forum
 

Archiving data from Durham to RAL using the File Transfer Service (FTS)

  • 1. Lydia Heck, Campus network engineering workshop 19/10/2016 Archiving data from Durham to RAL using the File Transfer Service (FTS)
  • 2. Archiving data from Durham to RAL using the File Transfer Service (FTS). Lydia Heck, Institute for Computational Cosmology, Manager of the DiRAC-2/2.5 Data Centric Facility COSMA. 19 October 2016, Campus Network Engineering for Data Intensive Science Workshop.
  • 3. Introduction to DiRAC
      • DiRAC (Distributed Research utilising Advanced Computing) was established in 2009 with DiRAC-1.
          • Supports research in theoretical astronomy, particle physics and nuclear physics.
          • Funded by STFC, with infrastructure money allocated from the Department for Business, Innovation and Skills (BIS).
          • The running costs, such as staff costs and electricity, are funded by STFC.
          • DiRAC is classed as a major research facility by STFC, on a par with the big telescopes.
  • 4. What is DiRAC?
      • A national service run, managed and allocated by the scientists who do the science, funded by BIS and STFC.
      • The systems are built around and for the applications with which the science is done.
      • We do not rival a facility like ARCHER, as we do not aspire to run a general national service.
  • 5. What is DiRAC? (cont'd)
      • For the highlights of science carried out on the DiRAC facility, please see http://www.dirac.ac.uk/science.html
      • Specific example: large-scale structure calculations with the Eagle run
          • 4096 cores
          • ~8 GB RAM/core
          • 47 days = 4,620,288 CPU hours
          • 200 TB of data
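The CPU-hour figure quoted for the Eagle run on slide 5 follows directly from the core count and the run length. A minimal sketch of that arithmetic, assuming all 4096 cores were occupied for the full 47 days (the function name is illustrative, not from the talk):

```python
def core_hours(cores: int, days: int) -> int:
    """Core-hours used by a run that occupies `cores` cores for `days` full days."""
    return cores * days * 24


# Eagle run as quoted on the slide: 4096 cores for 47 days.
print(core_hours(4096, 47))  # 4620288
```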
  • 6. The DiRAC computing systems
      • Blue Gene: Edinburgh
      • Cosmos: Cambridge
      • Complexity: Leicester
      • Data Centric: Durham
      • Data Analytic: Cambridge
  • 7. COSMA @ DiRAC (Data Centric)
      • Durham Data Centric system: IBM iDataPlex
      • 6720 Intel Sandy Bridge cores
      • 53.8 TB of RAM
      • FDR10 InfiniBand, 2:1 blocking
      • 2.5 Pbyte of GPFS storage (2.2 Pbyte used!)
  • 8. Resources of DiRAC
      • Long projects with a significant number of CPU hours, typically allocated for 3 years, on one or more of the 5 available systems.
      • Resources available (system, cores, CPU hours, storage, location):
          • Bluegene: 98,304 cores, 861 M CPU hours, 1 PB (GPFS), Edinburgh
          • Data Centric: 6720 Xeon cores, 59 M CPU hours, 2.5 PB (GPFS), Durham (DiRAC2)
          • Data Centric: 8000 Xeon cores, > 71 M CPU hours, 2.5 PB data (Lustre) and 1.8 PB scratch (Lustre), Durham (DiRAC2.5)
          • Complexity: 4352 Xeon cores, 38 M CPU hours, 0.8 PB (Panasas), Leicester
          • Data Analytic: 4800 Xeon cores, 42 M CPU hours, 0.75 PB (Lustre), Cambridge
          • SMP: 1784 Xeon cores (shared memory), 15.6 M CPU hours, 146 TB (EXT), Cambridge
  • 9. Why do we need to copy data?
      • During a project and when it is completed, copy data to home institutions
          • requires additional storage resource at researchers' home institutions
          • not enough provision; will require additional funds
      • Make backup copies
          • if disaster struck, many CPU hours of calculations would be lost
      • Copy data to other sites to leverage compute resources for post-processing
      • Storage on the HPC facility runs out of capacity: data creation considerably above expectation?
  • 10. Why do we copy data to RAL?
      • Research data must now be available to interested parties for a specified period of time
          • We could install DiRAC's own archive, but that requires funds and there is (currently) no budget
      • We needed to get started:
          • to gain experience
          • to get a valid backup
          • to remove data as the resources run out
          • to identify bottlenecks and technical challenges
      • Jeremy Yates (Director of DiRAC) negotiated access to the RAL archiving systems
      • Set up collaborations, make use of previous experience and pool resources
      • AND: copy data!
  • 11. Network connectivity of Durham University
      • 2012: upgrade to 4x1 Gbit to Janet
      • Janet advised investigating optimal utilisation of the available bandwidth before applying for a further upgrade
      • 2014: upgrade to 6 Gbit to Janet
      • Currently: 8 Gbit to Janet; should be a full 10 Gbit by the end of the year (technical issues)
  • 12. Network bandwidth: situation for Durham. 2014: measured throughput? (figure)
  • 13. 2014: measured limits? (figure)
  • 14. September 2014: measured limits (figure)
  • 15. Making optimal use of available bandwidth
      • Planning and investment to bypass the external campus firewall:
          • Preparatory work started in October/November 2014.
          • Two new routers (~£80k), configured for throughput with minimal ACLs, enough to safeguard the site.
          • Deploying internal firewalls: part of the new security infrastructure anyhow, but essential for such a venture.
          • Security now relies on the front-end systems of Durham DiRAC and Durham GridPP.
      • IPPP was moved outside the firewall in April 2015 with a clear mandate to manage security for their installation.
      • The DiRAC data transfer system was moved outside about 1 month later.
  • 16. GridPP: site firewall configuration for endpoint node (diagram of GridFTP paths: port blocking, pass-through, monitor with firewall, bypass site firewall)
  • 17. Result for DiRAC and GridPP in Durham
      • Guaranteed 3 Gbit/sec in/out
      • Consequences:
          • pushed the network performance for Durham GridPP from the bottom 3 in the country to the top 5 of the UK GridPP sites
          • they now experience different bottlenecks, but these are under their control
          • DiRAC data transfers achieve up to 300-400 Mbyte/sec throughput to RAL on archiving, depending on file sizes
          • faster data sharing with other collaboration sites
          • recently (October 2016) offered service to Earth Sciences, with 70-80 Mbyte/sec from a site in Switzerland
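To relate the figures on slide 17, it can help to convert the guaranteed line rate into the byte-based units used for the measured throughput. The sketch below assumes decimal units (1 Gbit = 10^9 bits, 1 Mbyte = 10^6 bytes) and ignores protocol overhead. Applying the rates to the 200 TB Eagle data set from slide 5 is purely an illustration, not a figure reported in the talk.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a line rate in Gbit/s to Mbyte/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


def transfer_days(terabytes: float, mbyte_per_s: float) -> float:
    """Days needed to move `terabytes` TB at a sustained rate of `mbyte_per_s`."""
    seconds = terabytes * 1e12 / (mbyte_per_s * 1e6)
    return seconds / 86400


# The guaranteed 3 Gbit/s corresponds to 375 Mbyte/s, which brackets the
# 300-400 Mbyte/s quoted for archiving to RAL.
print(gbit_to_mbyte_per_s(3))  # 375.0

# Illustrative only: time to move a 200 TB data set at the quoted rates.
for rate in (300, 400):
    print(rate, "Mbyte/s:", round(transfer_days(200, rate), 1), "days")
```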
  • 18. Collaboration between DiRAC and GridPP/RAL
      • Durham Institute for Computational Cosmology (ICC) volunteered to be the prototype installation.
      • Huge thanks to Jens Jensen and Brian Davies: there were many emails exchanged, many questions asked and many answers given.
      • Resulting document: “Setting up a system for data archiving using FTS3” by Lydia Heck, Jens Jensen and Brian Davies, https://www.cosma.dur.ac.uk/documentation
  • 19. Setting up the archiving tools
      • Identify appropriate hardware; this could mean extra expense:
          • need freedom to modify and experiment with it: cannot have HPC users logged in and working when you need to reboot the system!
          • free to apply the very latest security updates: this might not always be possible on an HPC system
          • requires optimal connection to storage: for the transfer system this meant an InfiniBand card
  • 20. Setting up the archiving tools
      • Create an interface to access the file/archiving service at RAL using the GridPP tools:
          • gridftp: Globus Toolkit, which also provides Globus Connect
          • trust anchors (egi-trustanchors)
          • voms tools (emi3-xxx)
          • fts3 (CERN)
  • 21. Chose to use FTS3 with GridFTP (diagram)
      • User submits transfer lists (and credentials)
      • GPFS: data.cosma.dur.ac.uk (GridFTP)
      • CASTOR-GEN: srm-dirac.gridpp.rl.ac.uk (SRM)
      • Components shown: GridFTP, FTS3
  • 22. Learning to use certificates and proxies
      • Long-lived voms proxy?
          • myproxy-init; myproxy-logon; voms-proxy-init; fts-transfer-delegation
      • How to create a proxy and delegation that last weeks, even months?
          • This is still an issue for a voms proxy, but we circumvented it by using a normal proxy.
          • grid-proxy-init; fts-transfer-delegation
          • grid-proxy-init -valid HH:MM
          • fts-transfer-delegation -e time-in-seconds
          • creates a proxy that lasts up to the certificate lifetime.
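The two commands on slide 22 take the lifetime in different units: HH:MM for grid-proxy-init and seconds for fts-transfer-delegation. The sketch below only builds the argument lists for a lifetime given in days and does not run anything. The flags are those shown on the slide; a real invocation would also need site-specific arguments (for example the FTS3 endpoint), which are omitted because the slide does not give them, and the helper name is hypothetical.

```python
from typing import List


def proxy_commands(days: int) -> List[List[str]]:
    """Build the two command lines from the slide for a lifetime of `days` days.

    Assumption: only the flags shown on the slide are used; endpoint and
    other site-specific arguments are deliberately left out.
    """
    hours = days * 24
    seconds = hours * 3600
    return [
        ["grid-proxy-init", "-valid", f"{hours}:00"],      # HH:MM
        ["fts-transfer-delegation", "-e", str(seconds)],    # seconds
    ]


for cmd in proxy_commands(30):
    print(" ".join(cmd))
```

The proxy cannot outlive the certificate it is derived from, so the requested lifetime is an upper bound rather than a guarantee.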
  • 23. Experiences
      1. Large files: optimal throughput limited by network bandwidth
      2. Many small files: limited by latency
      3. Many parallel sessions: impede the proper functioning of the archive server
      4. Ownership and creation dates not preserved: one grid owner
      5. The simple approach of “just” pushing files will not work!
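The first two experiences on slide 23 can be illustrated with a very simple cost model: each file pays a fixed per-file overhead, and the payload moves at the link rate. This is a sketch under assumed numbers (the 0.5 s overhead is hypothetical; 375 Mbyte/s corresponds to the guaranteed 3 Gbit/s), not a measurement from the talk.

```python
def transfer_seconds(n_files: int, total_bytes: float,
                     per_file_overhead_s: float,
                     bandwidth_bytes_per_s: float) -> float:
    """Estimated transfer time: per-file overhead plus payload at link rate."""
    return n_files * per_file_overhead_s + total_bytes / bandwidth_bytes_per_s


ONE_TB = 1e12
RATE = 375e6     # bytes/s, i.e. 3 Gbit/s
OVERHEAD = 0.5   # seconds per file, hypothetical

few_large = transfer_seconds(4, ONE_TB, OVERHEAD, RATE)
many_small = transfer_seconds(1_000_000, ONE_TB, OVERHEAD, RATE)
print(f"4 large files:     {few_large / 3600:.1f} h")
print(f"1,000,000 small:   {many_small / 3600:.1f} h")
```

With the same total volume, the per-file term dominates once the file count is large, which is why bundling small files into large archives moves the transfer back into the bandwidth-limited regime.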
  • 24. Actions to overcome issues
      • Tar files up in chunks of ~256 Gbyte
      • Exclude checked-out versioning subdirectories
      • The tar archive preserves ownership and time stamps
      • Keep a record of archived files
      • Files to transfer are large: limited by bandwidth, not by latency
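The chunking step on slide 24 can be sketched as a greedy grouping of files into tar-sized batches, skipping version-control directories and producing a record of which file went into which archive. The directory names, archive naming scheme and function names below are assumptions for illustration; the scripts actually used in Durham are not shown in the talk.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple

CHUNK_BYTES = 256 * 10**9                # ~256 Gbyte per tar file
EXCLUDED_DIRS = {".git", ".svn", "CVS"}  # assumed versioning directory names


def is_excluded(path: str) -> bool:
    """True if any component of the path is a versioning directory."""
    return any(part in EXCLUDED_DIRS for part in PurePosixPath(path).parts)


def plan_chunks(files: Iterable[Tuple[str, int]],
                chunk_bytes: int = CHUNK_BYTES) -> List[List[str]]:
    """Group (path, size) pairs, in order, into chunks of at most chunk_bytes.

    A single file larger than chunk_bytes gets a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        if is_excluded(path):
            continue
        if current and current_size + size > chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def manifest(chunks: List[List[str]], prefix: str = "archive") -> Dict[str, List[str]]:
    """Record of archived files: tar file name -> list of member paths."""
    return {f"{prefix}_{i:04d}.tar": paths for i, paths in enumerate(chunks)}


# Tiny example with a 10-byte chunk limit.
demo = [("a", 6), ("b", 5), ("x/.git/c", 1), ("d", 4)]
print(manifest(plan_chunks(demo, chunk_bytes=10)))
```

Each planned chunk could then be written with a tar tool; tar archives store owner and modification time per member, which is what lets ownership and time stamps survive the single grid owner at the archive.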
  • 25. Open issues
      • Depends on a single admin to carry out; not automatic.
      • What happens when the content of directories changes? Complete new archive sessions?
      • Create a tool more like rsync: requires extensive scripting.
      • When retrieving data, a whole subset has to be fetched to find a single file or a series of files.
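One building block of the rsync-like tool mentioned on slide 25 would be a comparison between the record of what was archived and the current state of a directory. The sketch below assumes each snapshot maps a path to a (size, modification time) pair; that representation and the function name are hypothetical and not taken from the talk.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, float]]  # path -> (size in bytes, mtime)


def diff_snapshot(archived: Snapshot, current: Snapshot) -> Dict[str, List[str]]:
    """Classify paths as new, changed or deleted since the last archive run."""
    new = sorted(p for p in current if p not in archived)
    changed = sorted(p for p in current
                     if p in archived and current[p] != archived[p])
    deleted = sorted(p for p in archived if p not in current)
    return {"new": new, "changed": changed, "deleted": deleted}


archived = {"run1/snap_000": (100, 1.0), "run1/snap_001": (200, 2.0)}
current = {"run1/snap_000": (100, 1.0), "run1/snap_001": (250, 3.0),
           "run1/snap_002": (300, 4.0)}
print(diff_snapshot(archived, current))
```

Only the new and changed paths would need to go into a further archive session, rather than re-archiving the whole directory.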
  • 26. Conclusions
      • With the right network speed we can archive the DiRAC data to RAL, or anywhere else with the right tools and connectivity.
      • Documenting the procedure is very important to transfer the knowledge and avoid duplicating effort. The documentation is online: https://www.cosma.dur.ac.uk/documentation
      • Each DiRAC site should have its own dirac0X account.
      • Start with and keep on archiving: this is more difficult, as it is not completely automatic yet and more development is required.
      • Collaboration between DiRAC and GridPP/RAL DOES work!
      • The work has been of benefit to other transfer actions, which significantly helps research and reflects well on the service we can deliver.
      • Can we aspire to more?
