Parallel Session A:
Supporting data
intensive applications
Chair: Tim Chown
Please switch your mobile phones to silent
17:30 - 19:00
No fire alarms scheduled. In the event of an alarm, please follow directions of NCC staff
Exhibitor showcase and drinks reception
18:00 - 19:00 Birds of a feather sessions
Janet end-to-end
performance initiative update
Tim Chown, Jisc
Why is end-to-end performance important?
» Overall goal: help optimise a site’s use of its Janet connectivity
» Seeing a growth in data-intensive science applications
› Includes established areas like GridPP (moving LHC data around)
› As well as new areas like cryo-electron microscopy
» Seeing an increasing number of remote computation scenarios
› e.g., scientific networked equipment, no local compute
› Might require 10Gbit/s to remote compute to return computation results on a 100GB data set to a researcher for timely visualisation
» Starting to see more 100Gbit/s connectivity requests
› Likely to have challenging data transfer requirements behind them
» As networking people, how do we help our researchers and their applications?
11/04/2017 Janet end-to-end performance initiative update
Speaking to your researchers
» Are you or your computing service department speaking to your researchers?
› If not, how do you understand their data-intensive requirements?
› If so, is this happening on a regular basis?
› Ideally you’d want to be able to plan ahead, rather than adapt on the fly
» Do you conduct networking “future looks”?
› Any step changes might dwarf your site’s organic growth
› This issue should have some attention at CIO or PVC Research level
» Do you know what your application elephants are?
› What’s the breakdown of your site’s network traffic?
› How are you monitoring network flows?
Researcher expectations
» How can we help set and manage researcher expectations?
» One aspect is helping them articulate their network requirements
› Volume of data in time X => data rate required
» It’s also about understanding practical limitations
› Can determine theoretical network throughput
› e.g. in principle, you can transfer 100TB over a 10Gbit/s link in 1 day
› But in practice, many factors may prevent this
» We should encourage researchers to speak to their computing service
» Computing services can in turn talk to the Janet Service Desk
› jisc.ac.uk/contact
» And noting here that cloud capabilities are becoming increasingly important
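The “volume of data in time X => data rate required” rule above is easy to work through. A minimal sketch (the `required_rate_gbps` helper is illustrative; the 100TB-in-a-day figure is the slide’s own example):

```python
def required_rate_gbps(volume_tb: float, hours: float) -> float:
    """Sustained rate (Gbit/s) needed to move volume_tb terabytes in `hours` hours."""
    bits = volume_tb * 1e12 * 8        # decimal TB -> bits
    return bits / (hours * 3600) / 1e9

# The slide's example: 100TB in 1 day needs a sustained ~9.3Gbit/s,
# so a 10Gbit/s link can in principle carry it -- with little headroom to spare.
print(round(required_rate_gbps(100, 24), 1))  # -> 9.3
```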
Janet end-to-end performance initiative
» This is the context in which Jisc set up the Janet end-to-end performance initiative
» The goals of the initiative include:
› Engaging with existing data-intensive research communities and identifying emerging communities
› Creating dialogue between Jisc, computing service groups, and research communities
› Holding workshops, facilitating discussion on e-mail lists, etc.
› Helping researchers manage expectations
› Establishing and sharing best practices in identifying and rectifying causes of poor performance
» More information:
› jisc.ac.uk/rd/projects/janet-end-to-end-performance-initiative
Understanding the factors affecting E2E
» Achieving optimal end-to-end performance is a multi-faceted problem.
» It includes:
› Appropriate provisioning between the end sites
› Properties of the local campus network (at each end), including capacity of the Janet connectivity, internal LAN design, the performance of firewalls and the configuration of other devices on the path
› End system configuration and tuning; network stack buffer sizes, disk I/O, memory management, etc.
› The choice of tools used to transfer data, and the underlying network protocols
» To optimise end-to-end performance, you need to address each aspect
Janet network engineering
» From Jisc’s perspective, it’s important to ensure there is sufficient capacity in the network
for its connected member sites
» We perform an ongoing review of network utilisation
› Provision the backbone network and regional links
› Provision external connectivity, to other NRENs and networks
› Observe the utilisation, model growth, predict ahead
› Step changes have bigger impact at the local rather than backbone scale
» Janet has no differential queueing for regular IP traffic
› The Netpath service exists for dedicated / overlay links
› In general, Jisc plans regular network upgrades with a view to ensuring that there is sufficient latent capacity in the network
Janet backbone, October 2016
Major external links, October 2016
E2EPI site visits
» We (Duncan Rand and I) have visited a dozen or so sites
â€ș Met with a variety of networking staff and researchers
â€ș And spoken to many others via email
» Really interesting to hear what sites are doing
› Some good practice evident, especially in local network engineering
› e.g. routing those elephants around main campus firewalls
› Campus firewalls often not designed for single high throughput flows
» Seeing varying use of site links
› e.g. 10G for campus, 10G resilient, 20G for research (GridPP)
› Some sites using their “resilient” link for bulk data transfers
» Some rate limiting of researcher traffic
› To avoid adverse impact on general campus traffic
› We’d encourage sites to talk to the JSD rather than rate limit
The Science DMZ model
» ESnet published the Science DMZ “design pattern” in 2012/13
› es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
» Three key elements:
› Network architecture; avoiding local bottlenecks
› Network performance measurement
› Data transfer node (DTN) design and configuration
» Also important to apply your security policy without impacting performance
» The NSF’s Cyberinfrastructure (CC*) Program funded this model in over 100 US universities:
› See nsf.gov/funding/pgm_summ.jsp?pims_id=504748
» No current funding equivalent in the UK; it’s down to individual campuses to fund changes to network architectures for data-intensive science
› But this can and should be part of your network architecture evolution
Good news – you’re doing a lot already
» There are several examples of sites in the UK that
have a form of Science DMZ deployment
» In many cases the deployments were made
without knowledge of the Science DMZ model
» But Science DMZ is simply a set of good principles
to follow, so it’s not surprising that some Janet
sites are already doing it
» Examples in the UK:
› Diamond Light Source (more from Alex next)
› JASMIN/CEDA Data Transfer Zone
› Imperial College GridPP; supports up to 40Gbit/s of IPv4/IPv6
› To realise the benefit, both end sites need to apply the principles
Examples of campus network engineering
» In principle you can just use your Janet IP service
» Where specific guarantees are required, the Netpath Plus service is available
› See jisc.ac.uk/netpath
› But then you’ll not be able to exceed that capacity
» Some sites split their campus links, e.g. 10G campus, 10G science
› Again, be careful about using your “resilient” link for bulk data
› It’s better to speak to the JSD about a primary link upgrade
› And ensure appropriate resilience for that
» Some sites rate-limit research traffic
» Some examples of physical / virtual overlays
› The WLCG (for LHC data) deployed OPN (optical) and LHCONE (virtual) networks
› Not clear that one overlay per research community would scale
» At least one site is exploring Cisco ACI (their SDN solution)
Measuring network characteristics
» It’s important to have telemetry on your network
» The Science DMZ model recommends perfSONAR for this
› More from Alex and Szymon later in this session
› Current version about to be 4.0 (as of April 17th, all being well)
» perfSONAR uses proven measurement tools under the hood
› e.g. iperf and owamp
» Can run between two perfSONAR systems or build a mesh
» Collects telemetry over time; throughput, loss, latency, traffic path
» Helps you assess the impact of changes to your network or systems
› And to understand variance in characteristics over time
» It can highlight poor performance, but doesn’t troubleshoot per se
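As a toy illustration of the kind of memory-to-memory throughput test that iperf performs under the hood (this loopback sketch is ours, not perfSONAR’s, and says nothing about a real network path):

```python
import socket
import threading
import time

def loopback_throughput_gbps(total_mb: int = 64) -> float:
    """Rough memory-to-memory TCP throughput over loopback, in Gbit/s.
    Illustrates the principle behind iperf-style tests only."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))          # let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def sink():
        conn, _ = srv.accept()
        while conn.recv(1 << 16):       # drain until the sender closes
            pass
        conn.close()

    t = threading.Thread(target=sink)
    t.start()
    cli = socket.create_connection(("127.0.0.1", port))
    chunk = b"\0" * (1 << 20)           # 1MB send buffer
    start = time.perf_counter()
    for _ in range(total_mb):
        cli.sendall(chunk)
    cli.close()
    t.join()
    elapsed = time.perf_counter() - start
    srv.close()
    return total_mb * (1 << 20) * 8 / elapsed / 1e9
```

A real test needs a remote endpoint, sustained durations, and loss/latency measurement alongside throughput, which is exactly what the perfSONAR toolchain packages up.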
Example: UK GridPP perfSONAR mesh
Janet perfSONAR / DTN test node(s)
» We’ve installed a 10G perfSONAR node at a Janet PoP in London
› Lets you test your site’s throughput to/from Janet backbone
› Might be useful to you if you know you’ll want to run some data-intensive applications in the near future, but don’t yet have perfSONAR at the far end, or if you just want to benchmark your site’s connectivity
› Ask us if you’re interested in using it
» We’re planning to add a second perfSONAR test node in our Slough DC
› Also planning to install a 10G reference SSD-based DTN there
› See https://fasterdata.es.net/science-dmz/DTN/
› This will allow disk-to-disk tests, using a variety of transfer tools
» We can also run a perfSONAR mesh for you, using MaDDash on a VM
» We may also deploy an experimental perfSONAR node
› e.g. to evaluate the new Google TCP-BBR implementation
Small node perfSONAR
» In some cases, just an indicative perfSONAR test is useful
» i.e., run loss/latency tests as normal, but limit throughput tests to 1Gbit/s
» For this scenario, you can build a small node perfSONAR system for under £250
» Jisc took part in the GEANT small node pilot project, using Gigabyte Brix:
› IPv4 and IPv6 test mesh at http://perfsonar-smallnodes.geant.org/maddash-webui/
» We now have a device build that we can offer to communities for testing
› Aim to make them as “plug and play” as possible
› And a stepping stone to a full perfSONAR node
» Further information and TNC2016 meeting slide deck:
› https://lists.geant.org/sympa/d_read/perfsonar-smallnodes/
perfSONAR small node test mesh
Aside: Google’s TCP-BBR
» Traditional TCP performs poorly even with just a fraction of 1% loss rate
» Google have been developing a new version of TCP
› TCP-BBR was open-sourced last year
› Requires just sender-side deployment
› Seeks high throughput with a small queue
› Good performance at up to 15% loss
› Google using it in production today
» Would be good to explore this further
› Understand impact on other TCP variants
› And when used for parallelised TCP applications like GridFTP
» See the presentation from the March 2017 IETF meeting:
› ietf.org/proceedings/98/slides/slides-98-iccrg-an-update-on-bbr-congestion-control-00.pdf
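Because BBR needs only sender-side deployment, on Linux the congestion control algorithm can even be selected per socket via the `TCP_CONGESTION` option. A sketch, assuming a kernel with the `tcp_bbr` module available (it falls back to the system default elsewhere):

```python
import socket

def set_congestion_control(sock: socket.socket, algo: str = "bbr") -> bool:
    """Try to select a TCP congestion control algorithm on a single socket.
    Linux-only; the algorithm's kernel module must be loaded and permitted."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algo.encode())
        return True
    except (AttributeError, OSError):
        return False  # not Linux, or the algorithm is unavailable

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
if not set_congestion_control(s, "bbr"):
    print("BBR unavailable; the kernel default (e.g. cubic) will be used")
s.close()
```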
Building on Science DMZ?
» We should seek to establish good principles and practices at all campuses
› And the research organisations they work with, like Diamond
› There’s already a good foundation at many GridPP sites
» The Janet backbone is heading towards 600G capacity and beyond
» We can seed further communities of good practice on this foundation
› e.g. the DiRAC HPC community, the SES consortium, …
» And grow a Research Data Transfer Zone (RDTZ) within and between campuses
› Build towards a UK RDTZ
› Inspired by the US Pacific Research Platform model of multi-site, multi-discipline research-driven collaboration built on NSF Science DMZ investment
» Many potential benefits, such as enabling new types of workflow
› e.g. streaming data to CPUs without the need to store locally
Transfer tools
» Your researchers are likely to find many available data transfer tools:
» There’s the simpler old friends like ftp and scp
â€ș But these are likely to give a bad initial impression of what your network can do
» There’sTCP-based tools designed to mitigate the impact of packet loss
â€ș GridFTP typically uses four parallelTCP streams
» Globus Connect is free for non-profit research and education use
â€ș See globus.org/globus-connect
» There’s tools to support management of transfers
â€ș FTS – see http://astro.dur.ac.uk/~dph0elh/documentation/transfer-data-to-ral-v1.4.pdf
» There’s also a commercial UDP-based option, Aspera
â€ș See asperasoft.com/
» It would be good to establish more benchmarking of these tools at Janet campuses
E2E performance to cloud compute
» We’re seeing growing interest in the use of commercial cloud compute
› e.g. to provide remote CPU for scientific equipment
» Complements compute available at the new EPSRC Tier-2 HPC facilities
› epsrc.ac.uk/research/facilities/hpc/tier2/
» Anecdotal reports of 2-3Gbit/s into AWS
› e.g. by researchers at the Institute of Cancer Research
› See presentations at RCUK Cloud Workshop - https://cloud.ac.uk/
» Bandwidth for AWS depends on the VM size
› See https://aws.amazon.com/ec2/instance-types/
» We’re keen to explore cloud compute connectivity further
› Includes AWS, MS ExpressRoute, …
› And scaling Janet connectivity to these services as appropriate
Future plans for E2EPI
» Our future plans include:
› Writing up and disseminating best practice case studies
› Growing a UK RDTZ by promoting such best practices within communities
› Deploying a second 10G Janet perfSONAR test node at our Slough DC
› Deploying a 10G reference DTN at Slough and performing transfer tool benchmarking
› Promoting wider campus perfSONAR deployment and community meshes
› Integrating perfSONAR data with router link utilisation
› Developing our troubleshooting support further
› Experimenting with Google’s TCP-BBR
› Exploring best practices in implementing security models for Science DMZ
› Expanding Science DMZ to include IPv6, SDN and other technologies
› Running a second community E2EPI workshop in October
› Holding a hands-on perfSONAR training event (after 4.0 is out)
Useful links
» Janet E2EPI project page
› jisc.ac.uk/rd/projects/janet-end-to-end-performance-initiative
» E2EPI Jisc community page
› https://community.jisc.ac.uk/groups/janet-end-end-performance-initiative
» JiscMail E2EPI list (approx 100 subscribers)
› jiscmail.ac.uk/cgi-bin/webadmin?A0=E2EPI
» Campus Network Engineering for Data-Intensive Science workshop slides
› jisc.ac.uk/events/campus-network-engineering-for-data-intensive-science-workshop-19-oct-2016
» Fasterdata knowledge base
› http://fasterdata.es.net/
» eduPERT knowledge base
› http://kb.pert.geant.net/PERTKB/WebHome
jisc.ac.uk
contact
Tim Chown
Jisc
tim.chown@jisc.ac.uk
Using perfSONAR and
Science DMZ to resolve
throughput issues
Alex White, Diamond
Diamond Light Source
perfSONAR and Science DMZ at Diamond
Diamond
»The Diamond machine is a type of particle accelerator
»CERN: high energy particles smashed together, analyse the crash
»Diamond: exploits the light produced by high energy particles undergoing acceleration
»Use this light to study matter – like a “super microscope”
What is the Diamond Light Source?
Diamond
»In the Oxfordshire countryside, by the A34 near Didcot
»Diamond is a not-for-profit joint venture between STFC and the Wellcome Trust
»Cost of use: Free access for scientists through a
competitive scientific application process
»Over 7000 researchers from academia and industry have
used our facility
What is the Diamond Light Source?
The Machine
»Three particle accelerators:
› Linear accelerator
› Booster synchrotron
› Storage ring
– 48 straight sections angled together to make a ring
– 562m long
– could be called a “tetracontakaioctagon”
The Machine
The Science
»Bright beams of light from
each bending magnet or
wiggler are directed into
laboratories known as
“beamlines”.
»We also have several cutting-edge electron microscopes.
»All of these experiments
operate concurrently!
»Spectroscopy
»Crystallography
»Tomography (think CAT Scan)
»Infrared
»X-ray absorption
»X-ray scattering
x-ray diffraction
Data Rates
»Central Lustre and GPFS filesystems for science data:
› 420TB, 900TB, 3.3PB (as of 2016)
»Typical x-ray camera: 4MB frame, 100x per second
»An experiment can easily produce 300GB-1TB
»Scientists want to take their data home
Data-intensive research
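Working through the slide’s numbers shows why this is network-relevant: a single such camera sustains 3.2Gbit/s, roughly a third of a 10Gb/s site link. A quick check:

```python
frame_mb = 4                              # MB per frame (from the slide)
frames_per_sec = 100                      # 100x per second
mb_per_sec = frame_mb * frames_per_sec    # 400 MB/s sustained from one camera
gbit_per_sec = mb_per_sec * 8 / 1000      # 3.2 Gbit/s
print(mb_per_sec, gbit_per_sec)           # -> 400 3.2
```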
Data Rates
»Each dataset might only be downloaded once
»I don’t know where my users are
»Some scientists want to push data back to Diamond
Data-intensive problems
Sneakernet
The old ways are the best?
Network Limits
»Science data downloads from Diamond to visiting users’ institutes were inconsistent and slow
› …even though the facility had a “10Gb/s” JANET connection from STFC.
»The limit on download speeds was delaying post-
experiment analysis at users’ home institutes.
The Problem
Characterising the problem
»Our target: “a stable 50Mb/s over a 10ms path”
› Using our site’s shared JANET connection
› In real terms, 10ms was approximately DLS to Oxford
Baseline findings
»Inside our network: 10Gb/s, no packet loss
»Over the STFC/JANET segment between Diamond’s edge and the Physics Department at Oxford:
› low and unpredictable speeds
› a small amount of packet loss
Baseline findings with iperf
Packet Loss
TCP performance is predicted by the Mathis equation:
Throughput ≤ (MSS / RTT) × (1 / √p)
»MSS = maximum segment size (packet size)
»RTT = round-trip latency
»p = packet loss probability
Packet Loss
“Interesting” effects of packet loss
Packet Loss
»According to Mathis, to achieve our initial goal over a 10ms path, the tolerable packet loss is:
› 0.026% (maximum)
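Inverting the simplified Mathis relation gives this tolerable loss directly. With an assumed 1000-byte MSS the sketch below reproduces the slide’s ~0.026% ceiling; the exact figure shifts with the MSS and constant factor assumed:

```python
def mathis_max_loss(target_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Largest packet loss probability p at which the simplified Mathis model
    rate = MSS / (RTT * sqrt(p)) still reaches target_bps."""
    mss_bits = mss_bytes * 8
    return (mss_bits / (rtt_s * target_bps)) ** 2

# Diamond's goal: a stable 50Mb/s over a 10ms path (the MSS here is an assumption)
p = mathis_max_loss(50e6, 0.010, 1000)
print(f"{p:.3%}")  # -> 0.026%
```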
Finding the problem in the Last Mile
»We worked with STFC to connect a perfSONAR server
directly to the main Harwell campus border router to look
for loss in the “last mile”
Science DMZ
The Fix: Science DMZ
Security
»Data intensive science traffic interacts poorly with enterprise firewalls
»Does this mean we can use the Science DMZ idea to just… ignore security?
› No!
Aside: Security without Firewalls
Security
»Implementing a Science DMZ means segmenting your network traffic
› Apply specific controls to data transfer hosts
› Avoid unnecessary controls
Science DMZ as a Security Architecture
Security
»Only run the services you need
»Protect each host:
› Router ACLs
› On-host firewall – Linux iptables is performant
Techniques to secure Science DMZ hosts
Globus GridFTP
»We adopted Globus GridFTP as our recommended transfer method
› It uses parallel TCP streams
› It retries and resumes automatically
› It has a simple, web-based interface
Diamond Science DMZ
The Diamond Science DMZ
Diamond Science DMZ
»Test data: 2Gb/s+ consistently between Diamond and the ESnet test point at Brookhaven Labs, New York State, USA
Speed records!
»Biggest: Electron Microscope data from DLS to Imperial
› 1120GB at 290Mb/s
»Fastest: Crystallography dataset from DLS to Newcastle
› 260GB at 480Mb/s
Actual transfers:
Bad bits
»Globus’ logs
› Netflow
»Globus install
»My own use of perfSONAR
Bad bits
SCP and Aspera
»SCP
»Aspera
Different Transfer Tools?
Near Future
»10Gb+ performance
› Globus Cluster for multiple 10Gb links
› 40Gb single links
Near Future
Thank you!
»Use real world testing
»Never believe your vendor
»Zero packet loss is crucial
»Enterprise firewalls introduce
packet loss
Alex White
Diamond Light Source
alex.white@diamond.ac.uk
jisc.ac.uk
contact
Alex White
Diamond
alex.white@diamond.ac.uk
Essentials of the Modern
Performance Monitoring with
perfSONAR
Szymon Trocha, PSNC / GÉANT, szymon.trocha@psnc.pl
11 April 2017
Motivations
»Identify problems, when they
happen or (better) earlier
»The tools must be available (at
campus endpoints,
demarcations between
networks, at exchange points,
and near data resources such
as storage and computing
elements, etc)
»Access to testing resources
12/04/2017 Essentials of the Modern Performance Monitoring with perfSONAR
Problem statement
» The global Research & Education network ecosystem is comprised of hundreds of
international, national, regional and local-scale networks
» While these networks all interconnect, each network is owned and operated by separate
organizations (called “domains”) with different policies, customers, funding models,
hardware, bandwidth and configurations
» This complex, heterogeneous set of networks must operate seamlessly from “end to end”
to support your science and research collaborations that are distributed globally
Where Are The (multidomain) Problems?
[Diagram: source campus (S) to destination campus (D), showing congested or faulty links between domains, congested intra-campus links, and latency-dependent problems inside domains with small RTT]
Challenges
»Delivering end-to-end performance
› Get the user, service delivery teams, local campus and metro/backbone network operators working together effectively
–Have tools in place
–Know your (network) expectations
–Be aware of network troubleshooting
What is perfSONAR?
» It’s infeasible to perform at-scale data movement all the time – as we see in other forms of science, we need to rely on simulations
» perfSONAR is a tool to:
› Set network performance expectations
› Find network problems (“soft failures”)
› Help fix these problems
› All in multi-domain environments
» These problems are all harder when multiple networks are involved
» perfSONAR provides a standard way to publish monitoring data
» This data is interesting to network researchers as well as network operators
The Toolkit
» Network performance comes down to a couple of key metrics:
› Throughput (e.g. “how much can I get out of the network”)
› Latency (time it takes to get to/from a destination)
› Packet loss/duplication/ordering (for some sampling of packets, do they all make it to the other side without serious abnormalities occurring?)
› Network utilization (the opposite of “throughput” for a moment in time)
» We can get many of these from a selection of measurement tools – the perfSONAR Toolkit
» The “perfSONAR Toolkit” is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols
» All components are available as RPMs, DEBs, and bundled as a CentOS ISO
» Very easy to install and configure (usually takes less than 30 minutes for default install)
Importance of Regular Testing
» We can’t wait for users to report problems and then fix them
» Things just break sometimes
› Bad system or network tuning
› Failing optics
› Somebody messed around in a patch panel and kinked a fiber
› Hardware goes bad
» Problems that get fixed have a way of coming back
› System defaults come back after hardware/software upgrades
› New employees may not know why the previous employee set things up a certain way and back out fixes
» Important to continually collect, archive, and alert on active test results
perfSONAR Deployment Possibilities
» Dedicated server
› A single CPU with multiple cores (2.7 GHz for 10Gbps tests)
› 4GB RAM
› 1Gbps onboard (mgmt + delay)
› 10Gbps PCI-slot NIC (throughput)
» Small node
› Low cost – small PC, e.g. GIGABYTE BRIX GB-BACE-3150 (various models)
Deployment Styles
» Beacon
» Island
» Mesh
» Ad hoc testing
› Docker
Location criteria
»Where it can be integrated
into the facility
software/hardware
management systems?
»Where it can do the most good
for the network operators or
users?
»Where it can do the most good
for the community?
[Diagram: deployment locations – at the edge, or next to services]
Distributed Deployment In a Few Steps
On measurement hosts:
» Choose your home (connect to network)
» Install Toolkit software (by site administrator)
» Configure hosts (networking; 2 interfaces; visibility)
» Point to central host (to consume central mesh configuration)
On the central server:
» Install and configure (by mesh administrator; central data storage; dashboard GUI; home for mesh configuration)
» Configure mesh (who, what and when; every 6 hours for bandwidth; be careful 10G -> 1G)
» Publish mesh configuration (to be consumed by measurement hosts)
» Run dashboard (observe thresholds; look for errors)
New 4.0 release announcement (April 17th)
» Introduction of pScheduler
› New software for measurements to replace BWCTL
» GUI changes
› More information, better presentation
» OS support upgrade
› Debian 7, Ubuntu 14
› CentOS 7
– Still supporting CentOS 6
» Mesh config GUI
» MaDDash alerting
4.0 Data Collection
New GUI
Thank you!
szymon.trocha@psnc.pl
• http://www.perfsonar.net/
• http://docs.perfsonar.net/
• http://www.perfsonar.net/about/getting-help/
• perfSONAR videos
• https://learn.nsrc.org/perfsonar
Hands-on course in London is coming…
jisc.ac.uk
Szymon Trocha
PSNC/GEANT
szymon.trocha@man.poznan.pl
 
SKA NZ R&D BeSTGRID Infrastructure
SKA NZ R&D BeSTGRID InfrastructureSKA NZ R&D BeSTGRID Infrastructure
SKA NZ R&D BeSTGRID Infrastructure
 
Lean approach to IT development
Lean approach to IT developmentLean approach to IT development
Lean approach to IT development
 
DSD-INT 2019 Modelling in DANUBIUS-RI-Bellafiore
DSD-INT 2019 Modelling in DANUBIUS-RI-BellafioreDSD-INT 2019 Modelling in DANUBIUS-RI-Bellafiore
DSD-INT 2019 Modelling in DANUBIUS-RI-Bellafiore
 
The future of research: are you ready? - Jeremy Frey - Jisc Digital Festival ...
The future of research: are you ready? - Jeremy Frey - Jisc Digital Festival ...The future of research: are you ready? - Jeremy Frey - Jisc Digital Festival ...
The future of research: are you ready? - Jeremy Frey - Jisc Digital Festival ...
 
The Environmental Futures & Big Data Impact Lab: Plymouth Launch Event Slides
The Environmental Futures & Big Data Impact Lab: Plymouth Launch Event SlidesThe Environmental Futures & Big Data Impact Lab: Plymouth Launch Event Slides
The Environmental Futures & Big Data Impact Lab: Plymouth Launch Event Slides
 
Creating a Climate for Innovation on Internet2 - Eric Boyd Senior Director, S...
Creating a Climate for Innovation on Internet2 - Eric Boyd Senior Director, S...Creating a Climate for Innovation on Internet2 - Eric Boyd Senior Director, S...
Creating a Climate for Innovation on Internet2 - Eric Boyd Senior Director, S...
 
Dev ops, noops or hypeops - Networkshop44
Dev ops, noops or hypeops -  Networkshop44Dev ops, noops or hypeops -  Networkshop44
Dev ops, noops or hypeops - Networkshop44
 
CINECA webinar slides: Data Gravity in the Life Sciences: Lessons learned fro...
CINECA webinar slides: Data Gravity in the Life Sciences: Lessons learned fro...CINECA webinar slides: Data Gravity in the Life Sciences: Lessons learned fro...
CINECA webinar slides: Data Gravity in the Life Sciences: Lessons learned fro...
 
Application of Assent in the safe - Networkshop44
Application of Assent in the safe -  Networkshop44Application of Assent in the safe -  Networkshop44
Application of Assent in the safe - Networkshop44
 
Research data spring: streamlining deposit
Research data spring: streamlining depositResearch data spring: streamlining deposit
Research data spring: streamlining deposit
 

Similar to Parallel session: supporting data-intensive applications

Future services on Janet
Future services on JanetFuture services on Janet
Future services on JanetJisc
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingGlobus
 
Provisioning Janet
Provisioning JanetProvisioning Janet
Provisioning JanetJisc
 
Future services on Janet
Future services on JanetFuture services on Janet
Future services on JanetJisc
 
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...SURFnet
 
Data Mobility Exhibition
Data Mobility ExhibitionData Mobility Exhibition
Data Mobility ExhibitionGlobus
 
Linac Coherent Light Source (LCLS) Data Transfer Requirements
Linac Coherent Light Source (LCLS) Data Transfer RequirementsLinac Coherent Light Source (LCLS) Data Transfer Requirements
Linac Coherent Light Source (LCLS) Data Transfer Requirementsinside-BigData.com
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research PlatformLarry Smarr
 
Shared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchShared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchMartin Hamilton
 
RNP 5th J-PAS 11-Nov-2012
RNP 5th J-PAS 11-Nov-2012RNP 5th J-PAS 11-Nov-2012
RNP 5th J-PAS 11-Nov-2012Alex Moura
 
Tutorial: Maximizing Performance and Network Utility with a Science DMZ
Tutorial: Maximizing Performance and Network Utility with a Science DMZTutorial: Maximizing Performance and Network Utility with a Science DMZ
Tutorial: Maximizing Performance and Network Utility with a Science DMZGlobus
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsTal Lavian Ph.D.
 
Built around answering questions
Built around answering questionsBuilt around answering questions
Built around answering questionsLarry Smarr
 
Networking Challenges for the Next Decade
Networking Challenges for the Next DecadeNetworking Challenges for the Next Decade
Networking Challenges for the Next DecadeOpen Networking Summit
 
Common Design Elements for Data Movement Eli Dart
Common Design Elements for Data Movement Eli DartCommon Design Elements for Data Movement Eli Dart
Common Design Elements for Data Movement Eli DartEd Dodds
 
Big Data Analytics and Advanced Computer Networking Scenarios
Big Data Analytics and Advanced Computer Networking ScenariosBig Data Analytics and Advanced Computer Networking Scenarios
Big Data Analytics and Advanced Computer Networking ScenariosStenio Fernandes
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011Ian Foster
 
Janet Futures
Janet FuturesJanet Futures
Janet FuturesJisc
 
Convergence of Machine Learning, Big Data and Supercomputing
Convergence of Machine Learning, Big Data and SupercomputingConvergence of Machine Learning, Big Data and Supercomputing
Convergence of Machine Learning, Big Data and SupercomputingDESMOND YUEN
 

Similar to Parallel session: supporting data-intensive applications (20)

Future services on Janet
Future services on JanetFuture services on Janet
Future services on Janet
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data Sharing
 
Provisioning Janet
Provisioning JanetProvisioning Janet
Provisioning Janet
 
Future services on Janet
Future services on JanetFuture services on Janet
Future services on Janet
 
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
 
Data Mobility Exhibition
Data Mobility ExhibitionData Mobility Exhibition
Data Mobility Exhibition
 
Linac Coherent Light Source (LCLS) Data Transfer Requirements
Linac Coherent Light Source (LCLS) Data Transfer RequirementsLinac Coherent Light Source (LCLS) Data Transfer Requirements
Linac Coherent Light Source (LCLS) Data Transfer Requirements
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
Shared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchShared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK research
 
RNP 5th J-PAS 11-Nov-2012
RNP 5th J-PAS 11-Nov-2012RNP 5th J-PAS 11-Nov-2012
RNP 5th J-PAS 11-Nov-2012
 
Tutorial: Maximizing Performance and Network Utility with a Science DMZ
Tutorial: Maximizing Performance and Network Utility with a Science DMZTutorial: Maximizing Performance and Network Utility with a Science DMZ
Tutorial: Maximizing Performance and Network Utility with a Science DMZ
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
 
Built around answering questions
Built around answering questionsBuilt around answering questions
Built around answering questions
 
Grid computing
Grid computingGrid computing
Grid computing
 
Networking Challenges for the Next Decade
Networking Challenges for the Next DecadeNetworking Challenges for the Next Decade
Networking Challenges for the Next Decade
 
Common Design Elements for Data Movement Eli Dart
Common Design Elements for Data Movement Eli DartCommon Design Elements for Data Movement Eli Dart
Common Design Elements for Data Movement Eli Dart
 
Big Data Analytics and Advanced Computer Networking Scenarios
Big Data Analytics and Advanced Computer Networking ScenariosBig Data Analytics and Advanced Computer Networking Scenarios
Big Data Analytics and Advanced Computer Networking Scenarios
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011
 
Janet Futures
Janet FuturesJanet Futures
Janet Futures
 
Convergence of Machine Learning, Big Data and Supercomputing
Convergence of Machine Learning, Big Data and SupercomputingConvergence of Machine Learning, Big Data and Supercomputing
Convergence of Machine Learning, Big Data and Supercomputing
 

More from Jisc

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...Jisc
 
Digital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxDigital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxJisc
 
Open Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxOpen Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxJisc
 
Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Jisc
 
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...Jisc
 
Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc
 
Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc
 
Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc
 
JISC Presentation.pptx
JISC Presentation.pptxJISC Presentation.pptx
JISC Presentation.pptxJisc
 
Community-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxCommunity-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxJisc
 
The Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxThe Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxJisc
 
Are we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxAre we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxJisc
 
JiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJisc
 
UWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxUWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxJisc
 
An introduction to Cyber Essentials
An introduction to Cyber EssentialsAn introduction to Cyber Essentials
An introduction to Cyber EssentialsJisc
 
MarkChilds.pptx
MarkChilds.pptxMarkChilds.pptx
MarkChilds.pptxJisc
 
RStrachanOct23.pptx
RStrachanOct23.pptxRStrachanOct23.pptx
RStrachanOct23.pptxJisc
 
ISDX2 Oct 2023 .pptx
ISDX2 Oct 2023 .pptxISDX2 Oct 2023 .pptx
ISDX2 Oct 2023 .pptxJisc
 
FerrellWalker.pptx
FerrellWalker.pptxFerrellWalker.pptx
FerrellWalker.pptxJisc
 

More from Jisc (20)

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...International students’ digital experience: understanding and mitigating the ...
International students’ digital experience: understanding and mitigating the ...
 
Digital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptxDigital Storytelling Community Launch!.pptx
Digital Storytelling Community Launch!.pptx
 
Open Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptxOpen Access book publishing understanding your options (1).pptx
Open Access book publishing understanding your options (1).pptx
 
Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...Scottish Universities Press supporting authors with requirements for open acc...
Scottish Universities Press supporting authors with requirements for open acc...
 
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...How Bloomsbury is supporting authors with UKRI long-form open access requirem...
How Bloomsbury is supporting authors with UKRI long-form open access requirem...
 
Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023Jisc Northern Ireland Strategy Forum 2023
Jisc Northern Ireland Strategy Forum 2023
 
Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023Jisc Scotland Strategy Forum 2023
Jisc Scotland Strategy Forum 2023
 
Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023Jisc stakeholder strategic update 2023
Jisc stakeholder strategic update 2023
 
JISC Presentation.pptx
JISC Presentation.pptxJISC Presentation.pptx
JISC Presentation.pptx
 
Community-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptxCommunity-led Open Access Publishing webinar.pptx
Community-led Open Access Publishing webinar.pptx
 
The Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptxThe Open Access Community Framework (OACF) 2023 (1).pptx
The Open Access Community Framework (OACF) 2023 (1).pptx
 
Are we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptxAre we onboard yet University of Sussex.pptx
Are we onboard yet University of Sussex.pptx
 
JiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptxJiscOAWeek_LAIR_slides_October2023.pptx
JiscOAWeek_LAIR_slides_October2023.pptx
 
UWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptxUWP OA Week Presentation (1).pptx
UWP OA Week Presentation (1).pptx
 
An introduction to Cyber Essentials
An introduction to Cyber EssentialsAn introduction to Cyber Essentials
An introduction to Cyber Essentials
 
MarkChilds.pptx
MarkChilds.pptxMarkChilds.pptx
MarkChilds.pptx
 
RStrachanOct23.pptx
RStrachanOct23.pptxRStrachanOct23.pptx
RStrachanOct23.pptx
 
ISDX2 Oct 2023 .pptx
ISDX2 Oct 2023 .pptxISDX2 Oct 2023 .pptx
ISDX2 Oct 2023 .pptx
 
FerrellWalker.pptx
FerrellWalker.pptxFerrellWalker.pptx
FerrellWalker.pptx
 

Recently uploaded

Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïžcall girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 

Recently uploaded (20)

Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïžcall girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 

Parallel session: supporting data-intensive applications

  • 1. Parallel Session A: Supporting data intensive applications Chair: Tim Chown
  • 2. Please switch your mobile phones to silent 17:30 - 19:00 No fire alarms scheduled. In the event of an alarm, please follow directions of NCC staff Exhibitor showcase and drinks reception 18:00 - 19:00 Birds of a feather sessions
  • 4. Why is end-to-end performance important? » Overall goal: help optimise a site’s use of its Janet connectivity » Seeing a growth in data-intensive science applications › Includes established areas like GridPP (moving LHC data around) › As well as new areas like cryo-electron microscopy » Seeing an increasing number of remote computation scenarios › e.g., scientific networked equipment, no local compute › Might require 10Gbit/s to remote compute to return computation results on a 100GB data set to a researcher for timely visualisation » Starting to see more 100Gbit/s connectivity requests › Likely to have challenging data transfer requirements behind them » As networking people, how do we help our researchers and their applications? 11/04/2017 Janet end-to-end performance initiative update
  • 5. Speaking to your researchers » Are you or your computing service department speaking to your researchers? › If not, how do you understand their data-intensive requirements? › If so, is this happening on a regular basis? › Ideally you’d want to be able to plan ahead, rather than adapt on the fly » Do you conduct networking “future looks”? › Any step changes might dwarf your site’s organic growth › This issue should have some attention at CIO or PVC Research level » Do you know what your application elephants are? › What’s the breakdown of your site’s network traffic? › How are you monitoring network flows? 11/04/2017 Janet end-to-end performance initiative update
  • 6. Researcher expectations » How can we help set and manage researcher expectations? » One aspect is helping them articulate their network requirements › Volume of data in time X => data rate required » It’s also about understanding practical limitations › Can determine theoretical network throughput › e.g. in principle, you can transfer 100TB over a 10Gbit/s link in 1 day › But in practice, many factors may prevent this » We should encourage researchers to speak to their computing service » Computing services can in turn talk to the Janet Service Desk › jisc.ac.uk/contact » And noting here that cloud capabilities are becoming increasingly important 11/04/2017 Janet end-to-end performance initiative update
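The "volume of data in time X => data rate required" arithmetic on this slide is easy to sketch. The following is an illustrative back-of-envelope helper, not anything from the talk itself; the function names are invented for this example, and real transfers rarely sustain full line rate.

```python
# Back-of-envelope sizing for a bulk data transfer.
# Decimal units throughout: 1 TB = 1e12 bytes, 1 Gbit/s = 1e9 bit/s.

def required_gbps(volume_tb: float, hours: float) -> float:
    """Sustained rate (Gbit/s) needed to move volume_tb within `hours`."""
    bits = volume_tb * 1e12 * 8
    return bits / (hours * 3600) / 1e9

def transfer_hours(volume_tb: float, rate_gbps: float) -> float:
    """Time (hours) to move volume_tb at a sustained rate_gbps."""
    bits = volume_tb * 1e12 * 8
    return bits / (rate_gbps * 1e9) / 3600

# The slide's example: 100 TB over a 10 Gbit/s link
print(round(transfer_hours(100, 10), 1))   # ~22.2 hours, i.e. roughly one day
```

In practice protocol overheads, disk I/O and competing traffic mean the achieved rate is well below the link rate, which is why the slide stresses the gap between theoretical and practical throughput.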
  • 7. Janet end-to-end performance initiative » This is the context in which Jisc set up the Janet end-to-end performance initiative » The goals of the initiative include: › Engaging with existing data-intensive research communities and identifying emerging communities › Creating dialogue between Jisc, computing service groups, and research communities › Holding workshops, facilitating discussion on e-mail lists, etc. › Helping researchers manage expectations › Establishing and sharing best practices in identifying and rectifying causes of poor performance » More information: › jisc.ac.uk/rd/projects/janet-end-to-end-performance-initiative 11/04/2017 Janet end-to-end performance initiative update
  • 8. Understanding the factors affecting E2E » Achieving optimal end-to-end performance is a multi-faceted problem. » It includes: › Appropriate provisioning between the end sites › Properties of the local campus network (at each end), including capacity of the Janet connectivity, internal LAN design, the performance of firewalls and the configuration of other devices on the path › End system configuration and tuning; network stack buffer sizes, disk I/O, memory management, etc. › The choice of tools used to transfer data, and the underlying network protocols » To optimise end-to-end performance, you need to address each aspect 11/04/2017 Janet end-to-end performance initiative update
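The "network stack buffer sizes" point on this slide comes down to the bandwidth-delay product: a single TCP flow can only keep a long path full if its socket buffers hold at least bandwidth × round-trip time of data. A small illustrative sketch (the function name and figures are this example's, not the talk's):

```python
# Bandwidth-delay product: the minimum TCP window / socket buffer
# needed to sustain a given rate over a path with a given RTT.

def bdp_bytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bytes: rate (bit/s) x RTT (s) / 8."""
    return rate_gbps * 1e9 * (rtt_ms / 1000) / 8

# e.g. filling a 10 Gbit/s path with a 20 ms RTT needs ~25 MB of buffer,
# far above common operating-system defaults
print(bdp_bytes(10, 20) / 1e6)   # 25.0 (MB)
```

This is one reason a transfer that performs well on a campus LAN can collapse over a long national or international path unless the end hosts are tuned.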
  • 9. Janet network engineering » From Jisc’s perspective, it’s important to ensure there is sufficient capacity in the network for its connected member sites » We perform an ongoing review of network utilisation › Provision the backbone network and regional links › Provision external connectivity, to other NRENs and networks › Observe the utilisation, model growth, predict ahead › Step changes have a bigger impact at the local scale than on the backbone » Janet has no differential queueing for regular IP traffic › The Netpath service exists for dedicated / overlay links › In general, Jisc plans regular network upgrades with a view to ensuring that there is sufficient latent capacity in the network 11/04/2017 Janet end-to-end performance initiative update
  • 10. Janet backbone, October 2016
  • 11. Major external links, October 2016
  • 12. E2EPI site visits » We (Duncan Rand and I) have visited a dozen or so sites › Met with a variety of networking staff and researchers › And spoken to many others via email » Really interesting to hear what sites are doing › Some good practice evident, especially in local network engineering › e.g. routing those elephants around main campus firewalls › Campus firewalls often not designed for single high throughput flows » Seeing varying use of site links › e.g. 10G for campus, 10G resilient, 20G for research (GridPP) › Some sites using their “resilient” link for bulk data transfers » Some rate limiting of researcher traffic › To avoid adverse impact on general campus traffic › We’d encourage sites to talk to the JSD rather than rate limit
  • 13. The Science DMZ model » ESnet published the Science DMZ “design pattern” in 2012/13 › es.net/assets/pubs_presos/sc13sciDMZ-final.pdf » Three key elements: › Network architecture; avoiding local bottlenecks › Network performance measurement › Data transfer node (DTN) design and configuration » Also important to apply your security policy without impacting performance » The NSF’s Cyberinfrastructure (CC*) Program funded this model in over 100 US universities: › See nsf.gov/funding/pgm_summ.jsp?pims_id=504748 » No current funding equivalent in the UK; it’s down to individual campuses to fund changes to network architectures for data-intensive science › But this can and should be part of your network architecture evolution
  • 14. Good news – you’re doing a lot already » There are several examples of sites in the UK that have a form of Science DMZ deployment » In many cases the deployments were made without knowledge of the Science DMZ model » But Science DMZ is simply a set of good principles to follow, so it’s not surprising that some Janet sites are already doing it » Examples in the UK: › Diamond Light Source (more from Alex next) › JASMIN/CEDA Data Transfer Zone › Imperial College GridPP; supports up to 40Gbit/s of IPv4/IPv6 › To realise the benefit, both end sites need to apply the principles
  • 15. Examples of campus network engineering » In principle you can just use your Janet IP service » Where specific guarantees are required, the Netpath Plus service is available › See jisc.ac.uk/netpath › But then you’ll not be able to exceed that capacity » Some sites split their campus links, e.g. 10G campus, 10G science › Again, be careful about using your “resilient” link for bulk data › It’s better to speak to the JSD about a primary link upgrade › And ensure appropriate resilience for that » Some sites rate-limit research traffic » Some examples of physical / virtual overlays › The WLCG (for LHC data) deployed OPN (optical) and LHCONE (virtual) networks › Not clear that one overlay per research community would scale » At least one site is exploring Cisco ACI (their SDN solution)
  • 16. Measuring network characteristics » It’s important to have telemetry on your network » The Science DMZ model recommends perfSONAR for this › More from Alex and Szymon later in this session › Current version about to be 4.0 (as of April 17th, all being well) » perfSONAR uses proven measurement tools under the hood › e.g. iperf and owamp » Can run between two perfSONAR systems or build a mesh » Collects telemetry over time; throughput, loss, latency, traffic path » Helps you assess the impact of changes to your network or systems › And to understand variance in characteristics over time » It can highlight poor performance, but doesn’t troubleshoot per se
  • 17. Example: UK GridPP perfSONAR mesh
  • 18. Janet perfSONAR / DTN test node(s) » We’ve installed a 10G perfSONAR node at a Janet PoP in London › Lets you test your site’s throughput to/from the Janet backbone › Might be useful to you if you know you’ll want to run some data-intensive applications in the near future, but don’t yet have perfSONAR at the far end, or if you just want to benchmark your site’s connectivity › Ask us if you’re interested in using it » We’re planning to add a second perfSONAR test node in our Slough DC › Also planning to install a 10G reference SSD-based DTN there › See https://fasterdata.es.net/science-dmz/DTN/ › This will allow disk-to-disk tests, using a variety of transfer tools » We can also run a perfSONAR mesh for you, using MaDDash on a VM » We may also deploy an experimental perfSONAR node › e.g. to evaluate the new Google TCP-BBR implementation
  • 19. Small node perfSONAR » In some cases, just an indicative perfSONAR test is useful » i.e., run loss/latency tests as normal, but limit throughput tests to 1Gbit/s » For this scenario, you can build a small node perfSONAR system for under £250 » Jisc took part in the GÉANT small node pilot project, using Gigabyte Brix: › IPv4 and IPv6 test mesh at http://perfsonar-smallnodes.geant.org/maddash-webui/ » We now have a device build that we can offer to communities for testing › Aim to make them as “plug and play” as possible › And a stepping stone to a full perfSONAR node » Further information and TNC2016 meeting slide deck: › https://lists.geant.org/sympa/d_read/perfsonar-smallnodes/
  • 20. perfSONAR small node test mesh
  • 21. Aside: Google’s TCP-BBR » Traditional TCP performs poorly with even a fraction of 1% packet loss » Google have been developing a new version of TCP › TCP-BBR was open-sourced last year › Requires sender-side deployment only › Seeks high throughput with a small queue › Good performance at up to 15% loss › Google are using it in production today » Would be good to explore this further › Understand its impact on other TCP variants › And when used for parallelised TCP applications like GridFTP » See the presentation from the March 2017 IETF meeting: › ietf.org/proceedings/98/slides/slides-98-iccrg-an-update-on-bbr-congestion-control-00.pdf
  • 22. Building on Science DMZ? » We should seek to establish good principles and practices at all campuses › And the research organisations they work with, like Diamond › There’s already a good foundation at many GridPP sites » The Janet backbone is heading towards 600G capacity and beyond » We can seed further communities of good practice on this foundation › e.g. the DiRAC HPC community, the SES consortium, … » And grow a Research Data Transfer Zone (RDTZ) within and between campuses › Build towards a UK RDTZ › Inspired by the US Pacific Research Platform model of multi-site, multi-discipline research-driven collaboration built on NSF Science DMZ investment » Many potential benefits, such as enabling new types of workflow › e.g. streaming data to CPUs without the need to store locally
  • 23. Transfer tools » Your researchers are likely to find many available data transfer tools: » There are the simpler old friends like ftp and scp › But these are likely to give a bad initial impression of what your network can do » There are TCP-based tools designed to mitigate the impact of packet loss › GridFTP typically uses four parallel TCP streams » Globus Connect is free for non-profit research and education use › See globus.org/globus-connect » There are tools to support management of transfers › FTS – see http://astro.dur.ac.uk/~dph0elh/documentation/transfer-data-to-ral-v1.4.pdf » There’s also a commercial UDP-based option, Aspera › See asperasoft.com/ » It would be good to establish more benchmarking of these tools at Janet campuses
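Why parallel streams help tools like GridFTP follows from the Mathis TCP model: a single stream's throughput is capped at roughly MSS/(RTT·√p), so N independent streams can carry about N times that under the same loss rate. A rough sketch – the loss rate, RTT, and function name are illustrative, not benchmarks of any real path:

```python
from math import sqrt

def mathis_rate_bps(mss_bytes: int, rtt_s: float, loss_prob: float) -> float:
    """Approximate upper bound on one TCP stream's throughput (Mathis model)."""
    return (mss_bytes * 8) / (rtt_s * sqrt(loss_prob))

# One stream: 1460-byte MSS, 20 ms RTT, 0.01% loss.
single = mathis_rate_bps(1460, 0.020, 1e-4)
# GridFTP's typical four streams, each seeing the same path conditions.
aggregate = 4 * single
print(f"{single/1e6:.0f} Mbit/s single, {aggregate/1e6:.0f} Mbit/s with 4 streams")
```

The same calculation also shows why ftp/scp disappoint: one lossy stream alone sets the ceiling, and a dropped packet stalls the whole transfer rather than one quarter of it.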
  • 24. E2E performance to cloud compute » We’re seeing growing interest in the use of commercial cloud compute › e.g. to provide remote CPU for scientific equipment » Complements compute available at the new EPSRC Tier-2 HPC facilities › epsrc.ac.uk/research/facilities/hpc/tier2/ » Anecdotal reports of 2-3Gbit/s into AWS › e.g. by researchers at the Institute of Cancer Research › See presentations at the RCUK Cloud Workshop – https://cloud.ac.uk/ » Bandwidth for AWS depends on the VM size › See https://aws.amazon.com/ec2/instance-types/ » We’re keen to explore cloud compute connectivity further › Includes AWS, MS ExpressRoute, … › And scaling Janet connectivity to these services as appropriate
  • 25. Future plans for E2EPI » Our future plans include: › Writing up and disseminating best practice case studies › Growing a UK RDTZ by promoting such best practices within communities › Deploying a second 10G Janet perfSONAR test node at our Slough DC › Deploying a 10G reference DTN at Slough and performing transfer tool benchmarking › Promoting wider campus perfSONAR deployment and community meshes › Integrating perfSONAR data with router link utilisation › Developing our troubleshooting support further › Experimenting with Google’s TCP-BBR › Exploring best practices in implementing security models for Science DMZ › Expanding Science DMZ to include IPv6, SDN and other technologies › Running a second community E2EPI workshop in October › Holding a hands-on perfSONAR training event (after 4.0 is out)
  • 26. Useful links » Janet E2EPI project page › jisc.ac.uk/rd/projects/janet-end-to-end-performance-initiative » E2EPI Jisc community page › https://community.jisc.ac.uk/groups/janet-end-end-performance-initiative » JiscMail E2EPI list (approx 100 subscribers) › jiscmail.ac.uk/cgi-bin/webadmin?A0=E2EPI » Campus Network Engineering for Data-Intensive Science workshop slides › jisc.ac.uk/events/campus-network-engineering-for-data-intensive-science-workshop-19-oct-2016 » Fasterdata knowledge base › http://fasterdata.es.net/ » eduPERT knowledge base › http://kb.pert.geant.net/PERTKB/WebHome
  • 28. Using perfSONAR and Science DMZ to resolve throughput issues Alex White, Diamond
  • 29. Diamond Light Source perfSONAR and Science DMZ at Diamond
  • 30. Diamond » The Diamond machine is a type of particle accelerator » CERN: high energy particles smashed together, analyse the crash » Diamond: exploits the light produced by high energy particles undergoing acceleration » Use this light to study matter – like a “super microscope” What is the Diamond Light Source?
  • 31. Diamond » In the Oxfordshire countryside, by the A34 near Didcot » Diamond is a not-for-profit joint venture between STFC and the Wellcome Trust » Cost of use: free access for scientists through a competitive scientific application process » Over 7000 researchers from academia and industry have used our facility What is the Diamond Light Source?
  • 32. The Machine » Three particle accelerators: › Linear accelerator › Booster synchrotron › Storage ring – 48 straight sections angled together to make a ring – 562m long – could be called a “tetracontakaioctagon”
  • 33. The Machine
  • 34. The Science » Bright beams of light from each bending magnet or wiggler are directed into laboratories known as “beamlines”. » We also have several cutting-edge electron microscopes. » All of these experiments operate concurrently! » Spectroscopy » Crystallography » Tomography (think CAT scan) » Infrared » X-ray absorption » X-ray scattering
  • 35. X-ray diffraction
  • 36. Data Rates » Central Lustre and GPFS filesystems for science data: › 420TB, 900TB, 3.3PB (as of 2016) » Typical x-ray camera: 4MB frame, 100x per second » An experiment can easily produce 300GB-1TB » Scientists want to take their data home Data-intensive research
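The camera figures above translate directly into network requirements: 4 MB frames at 100 frames per second is 400 MB/s, or 3.2 Gbit/s sustained from a single detector – enough on its own to trouble a shared campus firewall. As a quick check:

```python
frame_mb = 4        # MB per frame (typical x-ray camera, per the slide)
frames_per_s = 100  # frames per second

rate_mb_s = frame_mb * frames_per_s   # data rate to storage, MB/s
rate_gbit_s = rate_mb_s * 8 / 1000    # same rate on the wire, Gbit/s
print(rate_mb_s, rate_gbit_s)  # 400 3.2
```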
  • 37. Data Rates » Each dataset might only be downloaded once » I don’t know where my users are » Some scientists want to push data back to Diamond Data-intensive problems
  • 38. Sneakernet The old ways are the best?
  • 39. Network Limits » Science data downloads from Diamond to visiting users’ institutes were inconsistent and slow › … even though the facility had a “10Gb/s” JANET connection from STFC. » The limit on download speeds was delaying post-experiment analysis at users’ home institutes. The Problem
  • 40. Characterising the problem » Our target: “a stable 50Mb/s over a 10ms path” › Using our site’s shared JANET connection › In real terms, 10ms was approximately DLS to Oxford
  • 41. Baseline findings » Inside our network: 10Gb/s, no packet loss » Over the STFC/JANET segment between Diamond’s edge and the Physics Department at Oxford: › low and unpredictable speeds › a small amount of packet loss Baseline findings with iperf
  • 42. Packet Loss » TCP performance is predicted by the Mathis equation, which combines three factors: › MSS (packet size) › Latency (RTT) › Packet loss probability » Throughput ≀ (MSS / RTT) · (1 / √p), where p is the packet loss probability
  • 43. Packet Loss “Interesting” effects of packet loss
  • 44. Packet Loss » According to Mathis, to achieve our initial goal over a 10ms path, the tolerable packet loss is: › 0.026% (maximum)
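The 0.026% bound can be reproduced by inverting the Mathis relation for the largest loss probability that still permits the target rate. Note the exact figure depends on the segment size assumed: the slide's value matches a ~1000-byte segment; with a full 1460-byte Ethernet MSS the bound relaxes to roughly 0.055%. A sketch of the calculation, with the function name my own:

```python
def max_tolerable_loss(mss_bytes: int, rtt_s: float, target_bps: float) -> float:
    """Invert the Mathis model: largest loss probability allowing target_bps."""
    return (mss_bytes * 8 / (rtt_s * target_bps)) ** 2

# Target: a stable 50 Mbit/s over a 10 ms path, assuming ~1000-byte segments.
p = max_tolerable_loss(1000, 0.010, 50e6)
print(f"{p * 100:.3f}% packet loss")  # 0.026% packet loss
```

The striking part is how small that number is: loss rates far below anything a LAN engineer would normally notice are already enough to break high-speed transfers over a 10 ms path.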
  • 45. Finding the problem in the Last Mile » We worked with STFC to connect a perfSONAR server directly to the main Harwell campus border router to look for loss in the “last mile”
  • 46. Science DMZ The Fix: Science DMZ
  • 47. Security » Data intensive science traffic interacts poorly with enterprise firewalls » Does this mean we can use the Science DMZ idea to just… ignore security? › No! Aside: Security without Firewalls
  • 48. Security » Implementing a Science DMZ means segmenting your network traffic › Apply specific controls to data transfer hosts › Avoid unnecessary controls Science DMZ as a Security Architecture
  • 49. Security » Only run the services you need » Protect each host: › Router ACLs › On-host firewall – Linux iptables is performant Techniques to secure Science DMZ hosts
  • 50. Globus GridFTP » We adopted Globus GridFTP as our recommended transfer method › It uses parallel TCP streams › It retries and resumes automatically › It has a simple, web-based interface
  • 51. Diamond Science DMZ The Diamond Science DMZ
  • 52. Diamond Science DMZ Speed records! » Test data: 2Gb/s+ consistently between Diamond and the ESnet test point at Brookhaven Labs, New York State, USA » Actual transfers: › Biggest: electron microscope data from DLS to Imperial – 1120GB at 290Mb/s › Fastest: crystallography dataset from DLS to Newcastle – 260GB at 480Mb/s
  • 53. Bad bits » Globus’ logs › Netflow » Globus install » My own use of perfSONAR
  • 54. SCP and Aspera » SCP » Aspera Different Transfer Tools?
  • 55. Near Future » 10Gb+ performance › Globus cluster for multiple 10Gb links › 40Gb single links
  • 56. Thank you! » Use real world testing » Never believe your vendor » Zero packet loss is crucial » Enterprise firewalls introduce packet loss Alex White, Diamond Light Source, alex.white@diamond.ac.uk
  • 58. Essentials of the Modern Performance Monitoring with perfSONAR Szymon Trocha, PSNC / GÉANT, szymon.trocha@psnc.pl 11 April 2017
  • 59. Motivations »Identify problems, when they happen or (better) earlier »The tools must be available (at campus endpoints, demarcations between networks, at exchange points, and near data resources such as storage and computing elements, etc) »Access to testing resources 12/04/2017 Essentials of the Modern Performance Monitoring with perfSONAR
  • 60. Problem statement » The global Research & Education network ecosystem is comprised of hundreds of international, national, regional and local-scale networks » While these networks all interconnect, each network is owned and operated by separate organizations (called “domains”) with different policies, customers, funding models, hardware, bandwidth and configurations » This complex, heterogeneous set of networks must operate seamlessly from “end to end” to support your science and research collaborations that are distributed globally
  • 61. Where Are The (Multidomain) Problems? [Diagram: traffic from a source campus (S) to a destination campus (D) can suffer congested or faulty links between domains, congested intra-campus links, and latency-dependent problems inside domains with small RTT]
  • 62. Challenges » Delivering end-to-end performance › Get the user, service delivery teams, local campus and metro/backbone network operators working together effectively – Have tools in place – Know your (network) expectations – Be aware of network troubleshooting
  • 63. What is perfSONAR? » It’s infeasible to perform at-scale data movement all the time – as we see in other forms of science, we need to rely on simulations » perfSONAR is a tool to: › Set network performance expectations › Find network problems (“soft failures”) › Help fix these problems › All in multi-domain environments » These problems are all harder when multiple networks are involved » perfSONAR provides a standard way to publish monitoring data » This data is interesting to network researchers as well as network operators
  • 64. The Toolkit » Network performance comes down to a few key metrics: › Throughput (e.g. “how much can I get out of the network”) › Latency (time it takes to get to/from a destination) › Packet loss/duplication/ordering (for some sampling of packets, do they all make it to the other side without serious abnormalities occurring?) › Network utilization (the opposite of “throughput” for a moment in time) » We can get many of these from a selection of measurement tools – the perfSONAR Toolkit » The “perfSONAR Toolkit” is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols » All components are available as RPMs, DEBs, and bundled as a CentOS ISO » Very easy to install and configure (usually takes less than 30 minutes for a default install)
  • 65. Importance of Regular Testing » We can’t wait for users to report problems and then fix them » Things just break sometimes › Bad system or network tuning › Failing optics › Somebody messed around in a patch panel and kinked a fiber › Hardware goes bad » Problems that get fixed have a way of coming back › System defaults come back after hardware/software upgrades › New employees may not know why the previous employee set things up a certain way and back out fixes » Important to continually collect, archive, and alert on active test results
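The "collect, archive, and alert" idea can be as simple as comparing each archived result against the series' baseline, the way a MaDDash cell changes colour. A toy illustration – the data, the 50% threshold, and the function name are invented, not perfSONAR defaults:

```python
from statistics import median

def flag_regressions(throughput_mbps: list, factor: float = 0.5) -> list:
    """Indices of tests that fell below `factor` x the historical median –
    the kind of soft failure regular testing is meant to catch early."""
    baseline = median(throughput_mbps)
    return [i for i, v in enumerate(throughput_mbps) if v < factor * baseline]

# Nightly throughput results (Mbit/s); the dip might be e.g. a failing optic.
history = [920, 910, 930, 915, 310, 925]
print(flag_regressions(history))  # [4]
```

The point of the slide is exactly this automation: a human watching graphs notices the dip days later, while a threshold check against the archive can alert on the first bad run.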
  • 66. perfSONAR Deployment Possibilities » Dedicated server › A single CPU with multiple cores (2.7 GHz for 10Gbps tests) › 4GB RAM › 1Gbps onboard NIC (mgmt + delay) › 10Gbps PCI-slot NIC (throughput) » Small node › Low cost – small PC, e.g. GIGABYTE BRIX GB-BACE-3150 (various models) » Ad hoc testing › Docker
  • 67. Deployment Styles: Beacon, Island, Mesh
  • 68. Location criteria » Where can it be integrated into the facility software/hardware management systems? » Where can it do the most good for the network operators or users? » Where can it do the most good for the community? [Placement options: at the edge; next to services]
  • 69. Distributed Deployment In a Few Steps » Hosts: › Choose your home – connect to network › Install Toolkit software – by site administrator › Configure hosts – networking, 2 interfaces, visibility › Point to central host – to consume the central mesh configuration » Central server: › Install and configure – by mesh administrator; central data storage, dashboard GUI, home for mesh configuration › Configure mesh – who, what and when; every 6 hours (bandwidth); be careful 10G -> 1G › Publish mesh configuration – to be consumed by measurement hosts › Run dashboard – observe thresholds, look for errors
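Conceptually, the mesh configuration the central server publishes is an enumeration of which hosts test to which, plus a schedule. A minimal sketch of generating a full-mesh test list – the hostnames are invented and this is not the perfSONAR mesh-config file format:

```python
from itertools import combinations

def full_mesh(hosts: list) -> list:
    """Every unordered pair of hosts tests each other: n*(n-1)/2 pairs."""
    return list(combinations(sorted(hosts), 2))

hosts = [
    "ps.site-a.example.ac.uk",
    "ps.site-b.example.ac.uk",
    "ps.site-c.example.ac.uk",
    "ps.pop.example.net",
]
pairs = full_mesh(hosts)
print(len(pairs))  # 6 test pairs for 4 hosts
```

The quadratic growth is also why the slide warns "be careful": bandwidth tests every 6 hours across a large mesh add up quickly, and a 10G node testing against a 1G node will happily report the smaller side as "broken".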
  • 70. New 4.0 release announcement (April 17th) » Introduction of pScheduler › New software for measurements, replacing BWCTL » GUI changes › More information, better presentation » OS support upgrade › Debian 7, Ubuntu 14 › CentOS 7 – still supporting CentOS 6 » Mesh config GUI » MaDDash alerting
  • 71. 4.0 Data Collection
  • 72. New GUI
  • 73. Thank you! szymon.trocha@psnc.pl • http://www.perfsonar.net/ • http://docs.perfsonar.net/ • http://www.perfsonar.net/about/getting-help/ • perfSONAR videos: https://learn.nsrc.org/perfsonar » Hands-on course in London is coming

  • 74. jisc.ac.uk Szymon Trocha, PSNC/GÉANT, szymon.trocha@man.poznan.pl

Editor's Notes

  1. Each dataset might only be downloaded once. Users might only want a small section of their data – over the course of an hour of data collection, perhaps only the last five minutes were useful. Does not make sense to push every byte to a CDN. I don’t know where my users are. Some organisations have federated agreements with partner institutions – dark fibre, or VPNs over the Internet. I however have users performing ad-hoc access from anywhere. I can’t block segments of the world – we have users from Russia, China etc. Some scientists want to push data back to Diamond: reprocessing of experiments with new techniques; new facilities located outside Diamond using our software analysis stack.
  2. When I started at Diamond, this is how users moved large amounts of data offsite. It’s a dedicated machine called a “data dispenser” – it’s actually a member of the Lustre and GPFS parallel filesystems, and it speaks USB3. It’s an embarrassment.
  3. We started testing Diamond’s own network by using cheap servers with perfSONAR, to run AD-HOC tests. First we checked them back-to-back that they could do 10Gb, then we tested deep inside the Science network next to the storage servers at the edge of the Diamond network, where we link to STFC Then we used perfSONAR’s built-in community list to find a third server just up the road at the University of Oxford to test against
  4. Packet loss? Is that a problem? In my experience, on the local area, no, it's never concerned me before. This was the attitude of my predecessors also.
  5. There are three major factors that affect TCP performance. All three are interrelated. The “Mathis” equation predicts the maximum transfer speed of a TCP link. TCP has the concept of a “Window Size”, which describes the amount of data that can be in flight in a TCP connection between ACK packets. The sending system will send a TCP window worth of data, and then wait for an acknowledgement from the receiver. At the start of a transfer, TCP begins with a small window size, which it ramps up in size as more and more data is successfully sent. You’ve heard of this before; this is just Bandwidth Delay product calculations for TCP Performance over Long Fat Networks. However, BDP calculations for a Long Fat network assume a lossless path, and in the real world, we don’t have lossless paths. When a TCP stream encounters packet loss, it has to recover. To do this, TCP rapidly decreases the window size for the amount of data in flight. This sounds obvious, but the effect of spurious packet loss on TCP is pretty profound.
  6. With everything being equal, the time for a TCP connection to recover from loss goes up as the latency goes up.
  7. The packet loss between Diamond and Oxford had previously been too small to cause concern to network engineers, but was found to be large enough to disrupt high-speed transfers over distances greater than a few miles.
  8. So... once we had an idea what the problem was, instead of just complaining at them, I could take actual fault data to my upstream supplier. Phil at STFC was gracious in working with us to allow us to connect up another PerfSonar server. Testing with this server let us run traffic just through the firewall, which showed us that the packet loss problem was with the firewall, and not with the JANET connection between us and Oxford
  9. “Science DMZ” (an idea from ESnet) is otherwise known as “moving your data transfer server to the edge of your network” Your JANET connection has high latency, but hopefully no loss. If your local LAN has an amount of packet loss, you might still see reasonable speed over it because the latency on the LAN is so small. The Science DMZ model puts your application server between the long and short latency parts, bridging them. Essentially, you’re splitting the latency domains.
  10. This section is cribbed from an utterly excellent presentation by Kate Petersen Mace from ESnet. Science DMZ is about reducing degrees of freedom and reducing the number of network devices in the critical path. Removing redundant devices, methods of entry etc, is a positive enhancement to security.
  11. The typical corporate LAN is a wild-west filled with devices that you didn’t do the engineering on yourself: network printers, web browsers, financial databases, in my case modern oscilloscopes running Windows 95… that cool touch-screen display kiosk in reception, Windows devices. In comparison to the corporate LAN, the Science DMZ is a calm, relaxed place. You know precisely what services you’re moving to the data transfer hosts. You’re typically only exposing one service to the Internet – FTP for example, or Aspera. You leave the enterprise firewall to do its job – that of preventing access to myriad unknown ports, and you use a more appropriate tool to secure each side.
  12. Linux iptables will happily do port ACLs on the host at 10Gb with no real hit on the CPU.
  13. This means that if a packet gets dropped in one stream, the other streams carry on the transfer while the affected one recovers.
  14. This is how we implemented the Science DMZ at diamond You can see the Data Transfer node, connected directly to the campus site front door routers There is a key question on how do you get your data to your Data Transfer node? Move data to it on a hard drive Route your data through your lossy firewall front door – it’s low latency, so Mathis says this should still be quick We went a step further and put in a private, non-routable fibre link to a server in the datacentre. The internal server has an important second function – Authentication of access to data. I didn’t want to put an AD server in the Science DMZ if I didn’t need to, so this internal server is running the Authentication server for Globus Online from inside our corporate network
  15. I tend to use perfSONAR just for ad-hoc testing. It’s fabulous for this, and I can direct my users to it for self-service testing of their network connections. I know it is very useful for periodic network monitoring, but I don’t take advantage of that yet
  16. The common implementation of SCP has a fixed TCP window size. It will never grow big enough to fill a fast link. Aspera – UDP based system, commercial. never used it, I’ve heard some people from various corners of science talk about it, but I’ve never been asked for it. I’m sure it would do well on the Science DMZ if we ever need Aspera.
  17. - globus online default stream splitting -> per file or chunks?