This document summarizes a presentation on supporting data-intensive applications. It discusses the Janet end-to-end performance initiative, which engages with data-intensive research communities to help optimise network performance. Key points include:
- A growth in data-intensive science applications and remote-computation scenarios requiring high bandwidth.
- The importance of understanding researcher requirements and setting expectations about practical throughput limits.
- Using perfSONAR to measure network characteristics and identify performance issues between sites on the Janet network.
- Adopting the "Science DMZ" model of separating research and campus traffic to avoid bottlenecks and optimise data transfer performance.
4. Why is end-to-end performance important?
» Overall goal: help optimise a site's use of its Janet connectivity
» Seeing a growth in data-intensive science applications
› Includes established areas like GridPP (moving LHC data around)
› As well as new areas like cryo-electron microscopy
» Seeing an increasing number of remote computation scenarios
› e.g. scientific networked equipment with no local compute
› Might require 10Gbit/s to remote compute to return computation results on a 100GB data set to a researcher for timely visualisation
» Starting to see more 100Gbit/s connectivity requests
› Likely to have challenging data transfer requirements behind them
» As networking people, how do we help our researchers and their applications?
5. Speaking to your researchers
» Are you or your computing service department speaking to your researchers?
› If not, how do you understand their data-intensive requirements?
› If so, is this happening on a regular basis?
› Ideally you'd want to be able to plan ahead, rather than adapt on the fly
» Do you conduct networking 'future looks'?
› Any step changes might dwarf your site's organic growth
› This issue should have some attention at CIO or PVC Research level
» Do you know what your application elephants are?
› What's the breakdown of your site's network traffic?
› How are you monitoring network flows?
6. Researcher expectations
» How can we help set and manage researcher expectations?
» One aspect is helping them articulate their network requirements
› Volume of data to move in time X => data rate required (see the quick calculation below)
» It's also about understanding practical limitations
› You can determine the theoretical network throughput
› e.g. in principle, you can transfer 100TB over a 10Gbit/s link in a day
› But in practice, many factors may prevent this
» We should encourage researchers to speak to their computing service
» Computing services can in turn talk to the Janet Service Desk
› jisc.ac.uk/contact
» And noting here that cloud capabilities are becoming increasingly important
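As a quick sanity check on the arithmetic above, a minimal sketch with illustrative values only (100TB over 10Gbit/s):

```python
# Back-of-the-envelope arithmetic for "volume of data in time X => rate".
volume_bits = 100e12 * 8      # 100 TB expressed in bits
link_rate_bps = 10e9          # 10 Gbit/s link

hours_at_line_rate = volume_bits / link_rate_bps / 3600
print(f"{hours_at_line_rate:.1f} hours at full line rate")   # ~22.2 hours

deadline_s = 24 * 3600        # researcher wants the data within a day
required_gbps = volume_bits / deadline_s / 1e9
print(f"{required_gbps:.1f} Gbit/s sustained to meet the deadline")  # ~9.3
```

Note that even the theoretical best case needs over 90% of the link sustained for a day, which is why the practical limitations matter.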
7. Janet end-to-end performance initiative
» This is the context in which Jisc set up the Janet end-to-end performance initiative
» The goals of the initiative include:
› Engaging with existing data-intensive research communities and identifying emerging communities
› Creating dialogue between Jisc, computing service groups, and research communities
› Holding workshops, facilitating discussion on e-mail lists, etc.
› Helping researchers manage expectations
› Establishing and sharing best practice in identifying and rectifying causes of poor performance
» More information:
› jisc.ac.uk/rd/projects/janet-end-to-end-performance-initiative
8. Understanding the factors affecting E2E
» Achieving optimal end-to-end performance is a multi-faceted problem. It includes:
› Appropriate provisioning between the end sites
› Properties of the local campus network (at each end), including capacity of the Janet connectivity, internal LAN design, the performance of firewalls and the configuration of other devices on the path
› End system configuration and tuning: network stack buffer sizes, disk I/O, memory management, etc. (see the buffer-sizing sketch below)
› The choice of tools used to transfer data, and the underlying network protocols
» To optimise end-to-end performance, you need to address each aspect
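On the end-system tuning point: a common first step is sizing TCP buffers to the bandwidth-delay product (BDP) of the path. A minimal sketch with illustrative numbers; the sysctl names in the comment are the standard Linux ones, but check fasterdata.es.net for currently recommended values:

```python
# Bandwidth-delay product: the buffer a single TCP stream needs to keep
# a path "full". Illustrative values: a 10 Gbit/s path at 20 ms RTT.
rate_bps = 10e9
rtt_s = 0.020
bdp_bytes = rate_bps * rtt_s / 8
print(f"BDP = {bdp_bytes / 1e6:.1f} MB")   # 25.0 MB

# On Linux the relevant knobs are net.core.rmem_max / net.core.wmem_max
# and net.ipv4.tcp_rmem / net.ipv4.tcp_wmem; their maximums should be
# at least the BDP, or the window can never open wide enough.
```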
9. Janet network engineering
» From Jisc's perspective, it's important to ensure there is sufficient capacity in the network for its connected member sites
» We perform an ongoing review of network utilisation
› Provision the backbone network and regional links
› Provision external connectivity, to other NRENs and networks
› Observe the utilisation, model growth, predict ahead
› Step changes have a bigger impact at the local scale than on the backbone
» Janet has no differential queueing for regular IP traffic
› The Netpath service exists for dedicated / overlay links
› In general, Jisc plans regular network upgrades with a view to ensuring that there is sufficient latent capacity in the network
11. Major external links, October 2016
12. E2EPI site visits
» We (Duncan Rand and I) have visited a dozen or so sites
› Met with a variety of networking staff and researchers
› And spoken to many others via email
» Really interesting to hear what sites are doing
› Some good practice evident, especially in local network engineering
› e.g. routing those elephant flows around main campus firewalls
› Campus firewalls are often not designed for single high-throughput flows
» Seeing varying use of site links
› e.g. 10G for campus, 10G resilient, 20G for research (GridPP)
› Some sites using their 'resilient' link for bulk data transfers
» Some rate limiting of researcher traffic
› To avoid adverse impact on general campus traffic
› We'd encourage sites to talk to the JSD rather than rate limit
13. The Science DMZ model
» ESnet published the Science DMZ 'design pattern' in 2012/13
› es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
» Three key elements:
› Network architecture: avoiding local bottlenecks
› Network performance measurement
› Data transfer node (DTN) design and configuration
» Also important to apply your security policy without impacting performance
» The NSF's Cyberinfrastructure (CC*) Program funded this model in over 100 US universities:
› See nsf.gov/funding/pgm_summ.jsp?pims_id=504748
» No current funding equivalent in the UK; it's down to individual campuses to fund changes to network architectures for data-intensive science
› But this can and should be part of your network architecture evolution
14. Good news – you're doing a lot already
» There are several examples of sites in the UK that have a form of Science DMZ deployment
» In many cases the deployments were made without knowledge of the Science DMZ model
» But Science DMZ is simply a set of good principles to follow, so it's not surprising that some Janet sites are already doing it
» Examples in the UK:
› Diamond Light Source (more from Alex next)
› JASMIN/CEDA Data Transfer Zone
› Imperial College GridPP; supports up to 40Gbit/s of IPv4/IPv6
» To realise the benefit, both end sites need to apply the principles
15. Examples of campus network engineering
» In principle you can just use your Janet IP service
» Where specific guarantees are required, the Netpath Plus service is available
› See jisc.ac.uk/netpath
› But then you'll not be able to exceed that capacity
» Some sites split their campus links, e.g. 10G campus, 10G science
› Again, be careful about using your 'resilient' link for bulk data
› It's better to speak to the JSD about a primary link upgrade
› And ensure appropriate resilience for that
» Some sites rate-limit research traffic
» Some examples of physical / virtual overlays
› The WLCG (for LHC data) deployed OPN (optical) and LHCONE (virtual) networks
› Not clear that one overlay per research community would scale
» At least one site is exploring Cisco ACI (their SDN solution)
16. Measuring network characteristics
» It's important to have telemetry on your network
» The Science DMZ model recommends perfSONAR for this
› More from Alex and Szymon later in this session
› Current version about to be 4.0 (as of April 17th, all being well)
» perfSONAR uses proven measurement tools under the hood
› e.g. iperf and owamp (see the sketch below)
» Can run between two perfSONAR systems, or build a mesh
» Collects telemetry over time: throughput, loss, latency, traffic path
» Helps you assess the impact of changes to your network or systems
› And to understand variance in characteristics over time
» It can highlight poor performance, but doesn't troubleshoot per se
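For a feel of the kind of underlying tool perfSONAR schedules, here is a minimal sketch that drives iperf3 directly and pulls the achieved throughput out of its JSON report. The host name is hypothetical; iperf3 must be installed locally and a server must be listening at the far end:

```python
import json
import subprocess

# Run a 10-second iperf3 TCP test against a (hypothetical) test host and
# report the receiver-side throughput, roughly what a perfSONAR
# throughput test records.
result = subprocess.run(
    ["iperf3", "-c", "ps-test.example.ac.uk", "-t", "10", "--json"],
    capture_output=True, text=True, check=True,
)
report = json.loads(result.stdout)
bps = report["end"]["sum_received"]["bits_per_second"]
print(f"Achieved throughput: {bps / 1e9:.2f} Gbit/s")
```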
18. Janet perfSONAR / DTN test node(s)
» We've installed a 10G perfSONAR node at a Janet PoP in London
› Lets you test your site's throughput to/from the Janet backbone
› Might be useful to you if you know you'll want to run some data-intensive applications in the near future but don't yet have perfSONAR at the far end, or if you just want to benchmark your site's connectivity
› Ask us if you're interested in using it
» We're planning to add a second perfSONAR test node in our Slough DC
› Also planning to install a 10G reference SSD-based DTN there
› See https://fasterdata.es.net/science-dmz/DTN/
› This will allow disk-to-disk tests, using a variety of transfer tools
» We can also run a perfSONAR mesh for you, using MaDDash on a VM
» We may also deploy an experimental perfSONAR node
› e.g. to evaluate the new Google TCP-BBR implementation
19. Small node perfSONAR
» In some cases, just an indicative perfSONAR test is useful
» i.e. run loss/latency tests as normal, but limit throughput tests to 1Gbit/s
» For this scenario, you can build a small node perfSONAR system for under £250
» Jisc took part in the GÉANT small node pilot project, using Gigabyte Brix:
› IPv4 and IPv6 test mesh at http://perfsonar-smallnodes.geant.org/maddash-webui/
» We now have a device build that we can offer to communities for testing
› Aim to make them as 'plug and play' as possible
› And a stepping stone to a full perfSONAR node
» Further information and TNC2016 meeting slide deck:
› https://lists.geant.org/sympa/d_read/perfsonar-smallnodes/
20. perfSONAR small node test mesh
21. Aside: Google's TCP-BBR
» Traditional TCP performs poorly with even a fraction of 1% packet loss
» Google have been developing a new TCP congestion-control algorithm
› TCP-BBR was open-sourced last year
› Requires sender-side deployment only
› Seeks high throughput with a small queue
› Good performance at up to 15% loss
› Google are using it in production today
» It would be good to explore this further (see the sketch below)
› Understand the impact on other TCP variants
› And when used for parallelised TCP applications like GridFTP
» See the presentation from the March 2017 IETF meeting:
› ietf.org/proceedings/98/slides/slides-98-iccrg-an-update-on-bbr-congestion-control-00.pdf
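Since BBR needs only sender-side changes, experimenting with it on a Linux sender is largely a configuration exercise. A minimal sketch, assuming a kernel (4.9 or later) with the tcp_bbr module available; the endpoint name is hypothetical:

```python
import socket

# Select BBR for one socket rather than system-wide (the system-wide
# equivalent is: sysctl -w net.ipv4.tcp_congestion_control=bbr).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
sock.connect(("dtn.example.ac.uk", 5201))  # hypothetical far-end DTN

# Confirm which algorithm the socket is actually using.
algo = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
print(algo.split(b"\0", 1)[0].decode())  # expect "bbr"
```

Per-socket selection like this is handy for A/B testing BBR against CUBIC on the same host without affecting other traffic.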
22. Building on Science DMZ?
» We should seek to establish good principles and practices at all campuses
› And the research organisations they work with, like Diamond
› There's already a good foundation at many GridPP sites
» The Janet backbone is heading towards 600G capacity and beyond
» We can seed further communities of good practice on this foundation
› e.g. the DiRAC HPC community, the SES consortium, …
» And grow a Research Data Transfer Zone (RDTZ) within and between campuses
› Build towards a UK RDTZ
› Inspired by the US Pacific Research Platform model of multi-site, multi-discipline research-driven collaboration built on NSF Science DMZ investment
» Many potential benefits, such as enabling new types of workflow
› e.g. streaming data to CPUs without the need to store locally
23. Transfer tools
» Your researchers are likely to find many available data transfer tools
» There are the simpler old friends like ftp and scp
› But these are likely to give a bad initial impression of what your network can do
» There are TCP-based tools designed to mitigate the impact of packet loss
› GridFTP typically uses four parallel TCP streams (the principle is sketched below)
» Globus Connect is free for non-profit research and education use
› See globus.org/globus-connect
» There are tools to support management of transfers
› FTS – see http://astro.dur.ac.uk/~dph0elh/documentation/transfer-data-to-ral-v1.4.pdf
» There's also a commercial UDP-based option, Aspera
› See asperasoft.com/
» It would be good to establish more benchmarking of these tools at Janet campuses
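To illustrate why parallel streams help (this is not GridFTP itself, just the principle): each stream recovers from a loss event independently, so a drop in one stream shrinks only that stream's window while the others keep the pipe busy. A minimal sketch using HTTP range requests over four concurrent connections; the URL and object size are hypothetical, and the server must support Range requests:

```python
import concurrent.futures
import urllib.request

URL = "http://dtn.example.ac.uk/dataset.bin"  # hypothetical source
SIZE = 400 * 1024 * 1024                      # assumed known object size
STREAMS = 4                                   # GridFTP's typical default

def fetch_range(start, end):
    """Fetch one byte range on its own TCP connection."""
    req = urllib.request.Request(URL, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

def parallel_download():
    chunk = SIZE // STREAMS
    ranges = [(i * chunk, (SIZE - 1) if i == STREAMS - 1 else (i + 1) * chunk - 1)
              for i in range(STREAMS)]
    buf = bytearray(SIZE)
    # Loss on one connection stalls only that range; the rest continue.
    with concurrent.futures.ThreadPoolExecutor(max_workers=STREAMS) as pool:
        for start, data in pool.map(lambda r: fetch_range(*r), ranges):
            buf[start:start + len(data)] = data
    return bytes(buf)

if __name__ == "__main__":
    data = parallel_download()
    print(f"Fetched {len(data)} bytes over {STREAMS} streams")
```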
24. E2E performance to cloud compute
» We're seeing growing interest in the use of commercial cloud compute
› e.g. to provide remote CPU for scientific equipment
» Complements compute available at the new EPSRC Tier-2 HPC facilities
› epsrc.ac.uk/research/facilities/hpc/tier2/
» Anecdotal reports of 2-3Gbit/s into AWS
› e.g. by researchers at the Institute of Cancer Research
› See presentations at the RCUK Cloud Workshop – https://cloud.ac.uk/
» Bandwidth for AWS depends on the VM size
› See https://aws.amazon.com/ec2/instance-types/
» We're keen to explore cloud compute connectivity further
› Includes AWS, MS ExpressRoute, …
› And scaling Janet connectivity to these services as appropriate
25. Future plans for E2EPI
» Our future plans include:
› Writing up and disseminating best practice case studies
› Growing a UK RDTZ by promoting such best practices within communities
› Deploying a second 10G Janet perfSONAR test node at our Slough DC
› Deploying a 10G reference DTN at Slough and performing transfer tool benchmarking
› Promoting wider campus perfSONAR deployment and community meshes
› Integrating perfSONAR data with router link utilisation
› Developing our troubleshooting support further
› Experimenting with Google's TCP-BBR
› Exploring best practices in implementing security models for Science DMZ
› Expanding Science DMZ to include IPv6, SDN and other technologies
› Running a second community E2EPI workshop in October
› Holding a hands-on perfSONAR training event (after 4.0 is out)
30. Diamond – What is the Diamond Light Source?
» The Diamond machine is a type of particle accelerator
» CERN: high-energy particles smashed together, analyse the crash
» Diamond: exploits the light produced by high-energy particles undergoing acceleration
» Use this light to study matter – like a 'super microscope'
31. Diamond – What is the Diamond Light Source?
» In the Oxfordshire countryside, by the A34 near Didcot
» Diamond is a not-for-profit joint venture between STFC and the Wellcome Trust
» Cost of use: free access for scientists through a competitive scientific application process
» Over 7000 researchers from academia and industry have used our facility
32. The Machine
» Three particle accelerators:
› Linear accelerator
› Booster synchrotron
› Storage ring
– 48 straight sections angled together to make a ring
– 562m long
– could be called a 'tetracontakaioctagon'
34. The Science
» Bright beams of light from each bending magnet or wiggler are directed into laboratories known as 'beamlines'.
» We also have several cutting-edge electron microscopes.
» All of these experiments operate concurrently!
» Techniques in use across the beamlines:
› Spectroscopy
› Crystallography
› Tomography (think CAT scan)
› Infrared
› X-ray absorption
› X-ray scattering
36. Data Rates – Data-intensive research
» Central Lustre and GPFS filesystems for science data:
› 420TB, 900TB, 3.3PB (as of 2016)
» Typical X-ray camera: 4MB per frame, 100 frames per second (see the quick calculation below)
» An experiment can easily produce 300GB-1TB
» Scientists want to take their data home
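For scale, the camera figures above imply a sustained rate of several Gbit/s per instrument. A quick check using the slide's own numbers:

```python
# Data rate of a typical x-ray camera from the figures above.
frame_bytes = 4 * 1024**2   # 4 MB per frame
fps = 100                   # frames per second
rate_bps = frame_bytes * fps * 8
print(f"{rate_bps / 1e9:.1f} Gbit/s sustained")  # ~3.4 Gbit/s
```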
37. Data Rates – Data-intensive problems
» Each dataset might only be downloaded once
» I don't know where my users are
» Some scientists want to push data back to Diamond
39. Network Limits – The Problem
» Science data downloads from Diamond to visiting users' institutes were inconsistent and slow
› …even though the facility had a '10Gb/s' JANET connection from STFC
» The limit on download speeds was delaying post-experiment analysis at users' home institutes
40. Characterising the problem
» Our target: 'a stable 50Mb/s over a 10ms path'
› Using our site's shared JANET connection
› In real terms, 10ms was approximately DLS to Oxford
41. Baseline findings with iperf
» Inside our network: 10Gb/s, no packet loss
» Over the STFC/JANET segment between Diamond's edge and the Physics Department at Oxford:
› low and unpredictable speeds
› a small amount of packet loss
42. Packet Loss
» TCP performance is predicted by the Mathis equation (worked through below):
Throughput ≤ (MSS / RTT) × (C / √p)
» where MSS is the packet (segment) size, RTT is the latency (round-trip time), p is the packet loss probability, and C is a constant of order 1
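A quick sketch of the equation in use, solving for the loss rate a target throughput can tolerate. Note that the Mathis constant C varies between roughly 0.6 and 1.2 depending on ACKing and loss-model assumptions, which is why the exact ceiling depends on the form used; with C around 0.7 this reproduces the ~0.026% figure quoted on the next slide:

```python
# Mathis: throughput <= (MSS / RTT) * (C / sqrt(p)).
# Rearranged for the tolerable loss: p <= (MSS * C / (RTT * rate))**2
mss_bits = 1460 * 8      # typical Ethernet MSS, in bits
rtt = 0.010              # 10 ms path, as in Diamond's target
target = 50e6            # 50 Mbit/s target throughput
for C in (0.7, 1.0):     # two common choices of the constant
    p = (mss_bits * C / (rtt * target)) ** 2
    print(f"C={C}: tolerable loss ~ {p * 100:.3f}%")  # ~0.027% and ~0.055%
```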
44. Packet Loss
» According to Mathis, to achieve our initial goal over a 10ms path, the tolerable packet loss is:
› 0.026% (maximum)
45. Finding the problem in the last mile
» We worked with STFC to connect a perfSONAR server directly to the main Harwell campus border router, to look for loss in the 'last mile'
47. Security – Aside: Security without Firewalls
» Data-intensive science traffic interacts poorly with enterprise firewalls
» Does this mean we can use the Science DMZ idea to just… ignore security?
› No!
48. Security – Science DMZ as a Security Architecture
» Implementing a Science DMZ means segmenting your network traffic
› Apply specific controls to data transfer hosts
› Avoid unnecessary controls
49. Security – Techniques to secure Science DMZ hosts
» Only run the services you need
» Protect each host:
› Router ACLs
› On-host firewall – Linux iptables is performant
50. Globus GridFTP
» We adopted Globus GridFTP as our recommended transfer method
› It uses parallel TCP streams
› It retries and resumes automatically
› It has a simple, web-based interface
52. Diamond Science DMZ – speed records!
» Test data: 2Gb/s+ consistently between Diamond and the ESnet test point at Brookhaven Labs, New York State, USA
» Actual transfers:
› Biggest: electron microscope data from DLS to Imperial – 1120GB at 290Mb/s
› Fastest: crystallography dataset from DLS to Newcastle – 260GB at 480Mb/s
53. Bad bits
» Globus' logs
› Netflow
» Globus install
» My own use of perfSONAR
55. Near Future
» 10Gb+ performance
› Globus cluster for multiple 10Gb links
› 40Gb single links
56. Thank you!
» Use real-world testing
» Never believe your vendor
» Zero packet loss is crucial
» Enterprise firewalls introduce packet loss

Alex White
Diamond Light Source
alex.white@diamond.ac.uk
58. Essentials of Modern Performance Monitoring with perfSONAR
Szymon Trocha, PSNC / GÉANT, szymon.trocha@psnc.pl
11 April 2017
59. Motivations
» Identify problems when they happen, or (better) before they do
» The tools must be available (at campus endpoints, at demarcations between networks, at exchange points, and near data resources such as storage and computing elements)
» Access to testing resources
60. Problem statement
» The global Research & Education network ecosystem comprises hundreds of international, national, regional and local-scale networks
» While these networks all interconnect, each network is owned and operated by a separate organization (a 'domain') with different policies, customers, funding models, hardware, bandwidth and configurations
» This complex, heterogeneous set of networks must operate seamlessly from 'end to end' to support science and research collaborations that are distributed globally
61. Where are the (multi-domain) problems?
[Diagram: a path from a source campus to a destination campus, highlighting congested or faulty links between domains, congested intra-campus links, and latency-dependent problems inside domains with small RTT]
62. Challenges
» Delivering end-to-end performance
› Get the user, service delivery teams, local campus and metro/backbone network operators working together effectively
– Have tools in place
– Know your (network) expectations
– Be aware of network troubleshooting
63. What is perfSONAR?
» It's infeasible to perform at-scale data movement all the time – as we see in other forms of science, we need to rely on simulations
» perfSONAR is a tool to:
› Set network performance expectations
› Find network problems ('soft failures')
› Help fix these problems
› All in multi-domain environments
» These problems are all harder when multiple networks are involved
» perfSONAR provides a standard way to publish monitoring data
» This data is interesting to network researchers as well as network operators
64. The Toolkit
» Network performance comes down to a few key metrics:
› Throughput (e.g. 'how much can I get out of the network')
› Latency (time it takes to get to/from a destination)
› Packet loss/duplication/ordering (for some sampling of packets, do they all make it to the other side without serious abnormalities?)
› Network utilization (the 'opposite' of throughput for a moment in time)
» We can get many of these from a selection of measurement tools – the perfSONAR Toolkit
» The perfSONAR Toolkit is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols
» All components are available as RPMs, DEBs, and bundled as a CentOS ISO
» Very easy to install and configure (usually less than 30 minutes for a default install)
65. Importance of Regular Testing
» We can't wait for users to report problems and then fix them
» Things just break sometimes
› Bad system or network tuning
› Failing optics
› Somebody messed around in a patch panel and kinked a fiber
› Hardware goes bad
» Problems that get fixed have a way of coming back
› System defaults come back after hardware/software upgrades
› New employees may not know why the previous employee set things up a certain way, and back out fixes
» It is important to continually collect, archive, and alert on active test results
66. perfSONAR Deployment Possibilities
» Dedicated server
› A single CPU with multiple cores (2.7 GHz for 10Gbps tests)
› 4GB RAM
› 1Gbps onboard NIC (management + delay)
› 10Gbps PCI-slot NIC (throughput)
» Small node – low-cost, small PC
› e.g. GIGABYTE BRIX GB-BACE-3150 (various models)
» Ad hoc testing
› Docker
68. Location criteria
» Where can it be integrated into the facility software/hardware management systems?
» Where can it do the most good for the network operators or users?
» Where can it do the most good for the community?
[Diagram: placement options at the network edge and next to services]
69. Distributed Deployment in a Few Steps
On the measurement hosts:
» Choose your home – connect to the network
» Install the Toolkit software – by the site administrator
» Configure the hosts – networking, 2 interfaces, visibility
» Point to the central host – to consume the central mesh configuration
On the central server:
» Install and configure – by the mesh administrator: central data storage, dashboard GUI, home for the mesh configuration
» Configure the mesh – who, what and when; e.g. bandwidth tests every 6 hours; be careful mixing 10G and 1G hosts
» Publish the mesh configuration – to be consumed by the measurement hosts
» Run the dashboard – observe thresholds, look for errors
70. New 4.0 release announcement – April 17th
» Introduction of pScheduler
› New software for measurements, replacing BWCTL
» GUI changes
› More information, better presentation
» OS support upgrade
› Debian 7, Ubuntu 14
› CentOS 7
– Still supporting CentOS 6
» Mesh config GUI
» MaDDash alerting
73. Thank you!
szymon.trocha@psnc.pl
• http://www.perfsonar.net/
• http://docs.perfsonar.net/
• http://www.perfsonar.net/about/getting-help/
• perfSONAR videos: https://learn.nsrc.org/perfsonar
A hands-on course in London is coming…
Speaker notes (Diamond talk)

Each dataset might only be downloaded once:
› Users might only want a small section of their data – over the course of an hour of data collection, perhaps only the last five minutes were useful
› It does not make sense to push every byte to a CDN
I don't know where my users are:
› Some organisations have federated agreements with partner institutions – dark fibre, or VPNs over the Internet. I, however, have users performing ad-hoc access from anywhere
› I can't block segments of the world – we have users from Russia, China etc.
Some scientists want to push data back to Diamond:
› Reprocessing of experiments with new techniques
› New facilities located outside Diamond using our software analysis stack
When I started at Diamond, this is how users moved large amounts of data offsite:
› A dedicated machine called a 'data dispenser'
› It's actually a member of the Lustre and GPFS parallel filesystems
› It speaks USB3
› It's an embarrassment
We started testing Diamond's own network using cheap servers with perfSONAR, to run ad-hoc tests. First we checked them back-to-back to confirm they could do 10Gb; then we tested deep inside the science network next to the storage servers, and at the edge of the Diamond network, where we link to STFC. Then we used perfSONAR's built-in community list to find a third server just up the road at the University of Oxford to test against.
Packet loss? Is that a problem? In my experience on the local area network it had never concerned me before, and this was the attitude of my predecessors too.
There are three major factors that affect TCP performance, and all three are interrelated. The 'Mathis' equation predicts the maximum transfer speed of a TCP link.
TCP has the concept of a 'window size', which describes the amount of data that can be in flight in a TCP connection between ACK packets. The sending system will send a TCP window's worth of data, and then wait for an acknowledgement from the receiver. At the start of a transfer, TCP begins with a small window size, which it ramps up as more and more data is successfully sent. You've heard of this before; this is just the bandwidth-delay product calculation for TCP performance over long fat networks.
However, BDP calculations for a long fat network assume a lossless path, and in the real world we don't have lossless paths. When a TCP stream encounters packet loss, it has to recover; to do this, TCP rapidly decreases the window size governing the amount of data in flight. This sounds obvious, but the effect of spurious packet loss on TCP is profound: all else being equal, the time for a TCP connection to recover from loss goes up as the latency goes up.
The packet loss between Diamond and Oxford was too small to have previously concerned network engineers, but was large enough to disrupt high-speed transfers over distances greater than a few miles.
So, once we had an idea what the problem was, instead of just complaining at my upstream supplier I could take them actual fault data. Phil at STFC was gracious in working with us to connect up another perfSONAR server. Testing with this server let us run traffic just through the firewall, which showed us that the packet loss problem was with the firewall, and not with the JANET connection between us and Oxford.
'Science DMZ' (an idea from ESnet) is otherwise known as 'moving your data transfer server to the edge of your network'. Your JANET connection has high latency but, hopefully, no loss. If your local LAN has some packet loss, you might still see reasonable speed over it because the latency on the LAN is so small. The Science DMZ model puts your application server between the long- and short-latency parts, bridging them; essentially, you're splitting the latency domains.
This section is cribbed from an utterly excellent presentation by Kate Petersen Mace from ESnet. Science DMZ is about reducing degrees of freedom and reducing the number of network devices in the critical path. Removing redundant devices, methods of entry and so on is a positive enhancement to security.
The typical corporate LAN is a wild west filled with devices that you didn't do the engineering on yourself:
› Network printers
› Web browsers
› Financial databases
› In my case, modern oscilloscopes running Windows 95…
› That cool touch-screen display kiosk in reception
› Windows devices
In comparison to the corporate LAN, the Science DMZ is a calm, relaxed place. You know precisely what services you're moving to the data transfer hosts. You're typically only exposing one service to the Internet – FTP, for example, or Aspera. You leave the enterprise firewall to do its job – preventing access to myriad unknown ports – and you use a more appropriate tool to secure each side. Linux iptables will happily do port ACLs on the host at 10Gb with no real hit on the CPU.
With parallel TCP streams, if a packet gets dropped in one stream, the other streams carry on the transfer while the affected one recovers.
This is how we implemented the Science DMZ at Diamond. You can see the data transfer node, connected directly to the campus site front-door routers. A key question is how you get your data to your data transfer node:
› Move data to it on a hard drive
› Route your data through your lossy firewall front door – it's low latency, so Mathis says this should still be quick
We went a step further and put in a private, non-routable fibre link to a server in the datacentre. The internal server has an important second function – authentication of access to data. I didn't want to put an AD server in the Science DMZ if I didn't need to, so this internal server runs the authentication service for Globus Online from inside our corporate network.
I tend to use perfSONAR just for ad-hoc testing. It's fabulous for this, and I can direct my users to it for self-service testing of their network connections. I know it is very useful for periodic network monitoring, but I don't take advantage of that yet.
The common implementation of scp has a fixed TCP window size; it will never grow big enough to fill a fast link.
Aspera is a commercial, UDP-based system. I've never used it; I've heard people from various corners of science talk about it, but I've never been asked for it. I'm sure it would do well on the Science DMZ if we ever need it.
Open question: does Globus Online's default stream splitting work per file, or in chunks?