Understanding Globus Data
Transfers with NetSage
Doug Southworth
dsouthworth@tacc.utexas.edu
TACC / UT Austin
GlobusWorld
May 8, 2024
NSF #2137603
Monitoring using NetSage
• NetSage: advanced measurement services for R&E data traffic
• Better understanding of current traffic patterns across
instrumented circuits
• Better understanding of large flow sources/sinks
• Performance information for data transfers
• Originally an NSF-funded collaboration between Indiana
University, LBNL, and the University of Hawaii at Manoa
• Now primary development at TACC
• Homepage: https://netsage.io
NetSage Focuses on Use Case Questions
• What are the top sites sharing data with my org?
• What average performance are they experiencing?
• What are the tasks between these two orgs?
• What science projects are transferring data to my site?
NetSage shows patterns of behavior for data
transfers between institutions
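As a rough illustration of the first two questions, the sketch below ranks organization pairs by total volume and reports the mean per-task rate. The record fields (`src_org`, `dst_org`, `bytes`, `seconds`) and the sample values are hypothetical and are not NetSage's actual schema or data.

```python
from collections import defaultdict

# Hypothetical transfer records; field names and values are illustrative only.
transfers = [
    {"src_org": "Site A", "dst_org": "Site B", "bytes": 4_000_000_000, "seconds": 40},
    {"src_org": "Site A", "dst_org": "Site B", "bytes": 1_000_000_000, "seconds": 20},
    {"src_org": "Site C", "dst_org": "Site B", "bytes": 2_000_000_000, "seconds": 10},
]

def top_pairs(records, limit=10):
    """Rank (src, dst) pairs by total bytes, with the mean per-task rate in Mbit/s."""
    totals = defaultdict(lambda: {"bytes": 0, "rates": []})
    for r in records:
        if r["seconds"] <= 0:
            continue  # skip records without a usable duration
        pair = (r["src_org"], r["dst_org"])
        totals[pair]["bytes"] += r["bytes"]
        totals[pair]["rates"].append(r["bytes"] * 8 / r["seconds"] / 1e6)
    rows = [
        (pair, t["bytes"], sum(t["rates"]) / len(t["rates"]))
        for pair, t in totals.items()
    ]
    # Largest total volume first
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]

for pair, total_bytes, mean_mbps in top_pairs(transfers):
    print(pair, total_bytes, round(mean_mbps, 1))
```

Averaging per-task rates is only one possible definition of "average performance"; dividing total bytes by total seconds would weight large transfers more heavily.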
NetSage Deployments to Date
• Originally developed to support NSF International
circuits (IRNC)
• Still supported by International Networks at Indiana Univ.
• Deployed by over 10 international teams
• For example: https://aponet.netsage.global
• Domestic regional networks supported by EPOC
• Six regional deployments
• About 4,000 unique visitors yearly
NetSage Architecture (Flow/SNMP)
← At your site | At TACC →
NetSage Architecture (Globus)
← At your site | At TACC →
Globus data from anywhere
Global Science Registry
• Maps flows to specific resources
• DTN
• Instrument
• Compute
• Contributions come from resource
owners
• Currently matched by IP; packet marking is on the
development roadmap
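Matching by IP could look something like the sketch below. The registry entries, field names, and the longest-prefix tie-break are all assumptions made for illustration (the prefixes are reserved documentation ranges); the actual Global Science Registry format is not described here.

```python
import ipaddress

# Hypothetical registry entries; prefixes are reserved documentation ranges.
REGISTRY = [
    {"prefix": "192.0.2.0/24", "resource": "Example DTN", "type": "DTN"},
    {"prefix": "198.51.100.0/25", "resource": "Example Instrument", "type": "Instrument"},
    {"prefix": "198.51.100.0/24", "resource": "Example Cluster", "type": "Compute"},
]

def lookup_resource(ip, registry=REGISTRY):
    """Return the entry whose prefix contains ip (longest prefix wins), or None."""
    addr = ipaddress.ip_address(ip)
    best, best_len = None, -1
    for entry in registry:
        net = ipaddress.ip_network(entry["prefix"])
        # Only compare addresses and networks of the same IP version
        if addr.version == net.version and addr in net and net.prefixlen > best_len:
            best, best_len = entry, net.prefixlen
    return best

print(lookup_resource("198.51.100.10"))   # falls in both the /25 and the /24
print(lookup_resource("203.0.113.5"))     # not covered by any entry
```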
NetSage Privacy
• NetSage is committed to privacy and to preemptively addressing
any security or data-sharing concerns
• No PII collected
• The last octet is removed from IP addresses
• Data Privacy Policy
• https://tinyurl.com/netsage-privacy
• Prototypes stay behind a password until we're told to make them
public
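One way to picture the last-octet step is the sketch below, which zeroes the final octet of an IPv4 address. Zeroing (rather than truncating the string) is an assumption, the function name is hypothetical, and IPv6 handling is not covered because it is not described here.

```python
import ipaddress

def drop_last_octet(ip):
    """Return the IPv4 address with its final octet set to 0."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only covers IPv4 addresses")
    # strict=False lets host bits be set; the /24 network address zeroes them
    masked = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(masked.network_address)

print(drop_last_octet("192.0.2.77"))
```

After this step, hosts on the same /24 become indistinguishable, which is why the dashboards compare subnets rather than individual machines.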
Live Example: tacc.netsage.io
Pick a sensor
Pick a timeframe
Links in tables for more
specific information
Top Pairs - For Readers
Top Pairs - For Visualizers
Top Talkers - Change Over Time
Detailed Individual Task Information:
Similar subnets,
different performance
Is this expected?
Takeaways
• NetSage can help answer questions about data movement
between sites
• Useful resource to understand how data is moving in various
R&E communities
• NetSage for Globus example:
• https://tacc.netsage.io
• More information
• Jennifer Schopf, Doug Southworth
• jms@tacc.utexas.edu, dsouthworth@tacc.utexas.edu
