
A reid ands_ttt2_perth_network-literacy 17_may18

Data handling and network literacy - presented by A Reid at the Accelerate your Data training event in Perth 17 May 2018

Published in: Education

  1. Data Handling & Network Literacy: data movement and network know-how. A train-the-trainer workshop.
  2. Talking Points (© AARNet Pty Ltd) ● Advanced research networks ● Network services ● Researcher data movement problems
  3. Talking Points ● Advanced research networks ● Network services ● Researcher data movement problems
  4. Typical Researcher Data File Sizes
     ● Kilobyte (KB) files: 1KB: email. 4KB: page of text. 25KB: 1-page spreadsheet. 30KB: page of formatted text. 40KB: simple Web page.
     ● Megabyte (MB) files: 1MB: scholarly paper. 2MB: e-Book. 3MB: HD photo. 5MB: 1 song (MP3). 10MB: 1-minute YouTube movie. 100MB: album of songs. 750MB: CD-ROM.
     ● Gigabyte (GB) files: 4GB: DVD movie. 5GB: modest USB stick. 20GB: Blu-ray movie. 100GB: 4K movie. 300GB: laptop backup. 500GB: high-end USB stick.
     ● Terabyte (TB) files: 2TB: laptop backup storage HDD. 5TB: typical Humanities data collection. 20TB: typical Medical data collection. 100TB: typical Genomics data collection.
     ● Petabyte (PB) files: 2PB: all Medical data collections on RDS. 2PB: all genomic data collections on RDS. 10PB: Climate data. 20PB: ASKAP data per year. 1000PB (1EB): SKA data per year.
  5. Research Data Services (RDS) Collections: https://www.rds.edu.au/
  6. Data Collection Sizes (Research Data Services, RDS)
     ● 90 data collections = 2200TB (average = 24TB/collection)
     ● 49 data collections = 359TB (average = 7TB/collection)
     ● 31 data collections = 2237TB (average = 72TB/collection)
     ● See RDS Research Community Projects
  7. 1TB Transfer. See the Tech of the Internets File Transfer Time Calculator: https://techinternets.com/copy_calc
  8. Australian NREN ● Advanced research network infrastructure ● Fast: 10 Gbps > 40 Gbps > 100 Gbps ● High capacity: 1 million+ users ● Tailored for research, teaching and learning ● Low latency: consistent connectivity and response time ● Designed to have “head room”
  9. National Network
  10. International Network
  11. Network Speed: Australian NREN. Download: 10 Gbps, 40 Gbps, 100 Gbps. Upload: 10 Gbps, 40 Gbps, 100 Gbps.
  12. Network Speed: Australian NBN. Download: 100 Mbps, 1 Gbps. Upload: 40 Mbps, 400 Mbps. See NBN Fact Sheet: Traffic Class 4 for data https://www.nbnco.com.au/content/dam/nbnco2/documents/nbn-business-fact-sheets/nbn-business-fact-sheet-tc4.pdf
  13. 1TB Transfer. See the Tech of the Internets File Transfer Time Calculator: https://techinternets.com/copy_calc
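The 1TB transfer comparison can be approximated with simple arithmetic. The sketch below is not the linked calculator; it assumes a decimal terabyte (10^12 bytes) and uses the link rates quoted on the Network Speed slides. The function names and the `efficiency` parameter are illustrative assumptions, with the default of 1.0 representing the best case where throughput equals bandwidth.

```python
TB = 10**12  # decimal terabyte in bytes (an assumption; some tools use 2**40)


def transfer_time_seconds(size_bytes, link_bps, efficiency=1.0):
    """Return the seconds needed to move size_bytes over a link of link_bps.

    efficiency is the assumed fraction of the link rate actually achieved;
    1.0 is the best case where throughput equals bandwidth.
    """
    if link_bps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_bps must be positive and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bps * efficiency)


def format_duration(seconds):
    """Render a duration in the largest convenient unit."""
    if seconds < 60:
        return f"{seconds:.0f} s"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    if seconds < 86400:
        return f"{seconds / 3600:.1f} h"
    return f"{seconds / 86400:.1f} days"


# Link rates taken from the NBN and NREN Network Speed slides.
LINKS_BPS = {
    "NBN upload 40 Mbps": 40e6,
    "NBN download 100 Mbps": 100e6,
    "NBN download 1 Gbps": 1e9,
    "NREN 10 Gbps": 10e9,
    "NREN 100 Gbps": 100e9,
}

if __name__ == "__main__":
    for name, bps in LINKS_BPS.items():
        print(f"{name:>22}: {format_duration(transfer_time_seconds(TB, bps))}")
```

Under these assumptions, 1TB takes roughly 22 hours at 100 Mbps and roughly 13 minutes at 10 Gbps. Real transfers are usually slower, so passing a lower `efficiency` gives a more cautious estimate.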
  14. What can prevent throughput from equalling bandwidth? You need to know the true end-to-end network connectivity and characteristics.
     • Contention/congestion, firewalls, PC speed, applications, …
     • For very large file transfers, buffer sizes and other network tuning might be required.
     • There are some simple things researchers can do themselves to get a handle on file transfer issues.
     Exercises: ● Speedtest 🚀 ● Ping 🔊 ● Traceroute 🔎
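The slide does not say how buffer sizes limit large transfers. One common rule of thumb, offered here as an assumption rather than as the presenter's method, is the bandwidth-delay product: the amount of data that must be "in flight" to keep a link full. The sketch below uses hypothetical function names and illustrative numbers (a 10 Gbps path with a 50 ms round trip, and a 64 KiB window).

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_seconds):
    """Bytes that must be in flight to fill a link of the given rate and RTT."""
    return bandwidth_bps * rtt_seconds / 8


def window_limited_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most window_bytes are sent per round trip."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # Illustrative values only: a 10 Gbps path with a 50 ms round-trip time.
    bdp = bandwidth_delay_product_bytes(10e9, 0.050)
    print(f"Buffer needed to fill the link: {bdp / 1e6:.1f} MB")

    # The same path limited by a 64 KiB window.
    limit = window_limited_throughput_bps(64 * 1024, 0.050)
    print(f"Throughput with a 64 KiB window: {limit / 1e6:.1f} Mbps")
```

With these example numbers, a small window caps the transfer at around 10 Mbps even though the link itself runs at 10 Gbps, which is one way throughput can fall far below bandwidth.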
  15. Speed Test 🚀 http://www.speedtest.net/ Test on wifi and on a mobile phone. Pay attention to upload vs download speeds.
  16. Bandwidth & Throughput. Network Bandwidth (bps) measures the maximum speed at which data can be transferred. Network Throughput (bps) measures the actual speed of transfer across an end-to-end network.
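The difference between the two definitions can be made concrete by timing a transfer and comparing the result with the nominal link rate. This is a minimal sketch with hypothetical function names; the example figures (3 GB moved in 60 seconds over a 1 Gbps link) are made up for illustration and are not measurements.

```python
def throughput_bps(bytes_moved, elapsed_seconds):
    """Actual transfer rate in bits per second for a completed transfer."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_moved * 8 / elapsed_seconds


def utilisation(measured_bps, bandwidth_bps):
    """Fraction of the nominal bandwidth that the transfer achieved."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be positive")
    return measured_bps / bandwidth_bps


if __name__ == "__main__":
    # Illustrative figures: 3 GB copied in 60 s over a nominal 1 Gbps link.
    measured = throughput_bps(3 * 10**9, 60)
    print(f"Throughput: {measured / 1e6:.0f} Mbps")
    print(f"Utilisation: {utilisation(measured, 1e9):.0%}")
```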
  17. Ping 🔊 Identifying issues with end-to-end data transfers. Type “ping” and the web address, e.g. ping aarnet.edu.au. See WikiHow: https://m.wikihow.com/Ping-an-IP-Address
     ● Windows: open the Command Prompt using “cmd”, then type ping aarnet.edu.au
     ● Linux: open a terminal window, then type ping aarnet.edu.au (press CTRL and C to stop the command)
     ● Mac: open Network Utility (using Spotlight), select the Ping tab and type in aarnet.edu.au
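For repeated checks, the ping exercise can be scripted. The sketch below is an illustration, not part of the workshop material: it runs the system `ping` command (`-c` for the packet count on Linux/Mac, `-n` on Windows) and extracts the round-trip summary. It assumes English-language output in the usual Linux, macOS, or Windows summary format, and the function names are hypothetical.

```python
import platform
import re
import subprocess

# Summary line printed by Linux ("rtt ... mdev") and macOS ("round-trip ... stddev").
UNIX_SUMMARY = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/[\d.]+ ms"
)
# Summary line printed by the English-language Windows ping.
WINDOWS_SUMMARY = re.compile(
    r"Minimum = (\d+)ms, Maximum = (\d+)ms, Average = (\d+)ms"
)


def parse_ping_summary(output):
    """Return min/avg/max round-trip times in ms, or None if no summary is found."""
    match = UNIX_SUMMARY.search(output)
    if match:
        low, avg, high = (float(x) for x in match.groups())
        return {"min_ms": low, "avg_ms": avg, "max_ms": high}
    match = WINDOWS_SUMMARY.search(output)
    if match:
        low, high, avg = (float(x) for x in match.groups())
        return {"min_ms": low, "avg_ms": avg, "max_ms": high}
    return None


def run_ping(host, count=4):
    """Run the system ping command and return its text output."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, str(count), host],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    print(parse_ping_summary(run_ping("aarnet.edu.au")))
```

A result of `None` means no summary line was found, for example because the host did not reply or a firewall blocks ping.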
  18. Traceroute 🔎 Identifying the failure point of data transfers. Type “tracert” or “traceroute” and the web address, e.g. tracert aarnet.edu.au
     ● Windows: open the Command Prompt using “cmd”, then type tracert aarnet.edu.au
     ● Linux: open a terminal window, then type traceroute aarnet.edu.au (press CTRL and C to stop the command)
     ● Mac: open Network Utility (using Spotlight), select the Traceroute tab and type in aarnet.edu.au
  19. Latency
     • Normally determined by the distance from one end to the other.
     • Can also be affected by the speed of the switches through which the signal travels.
     • Signals move along wires/fibres at only a fraction of the speed of light (roughly two-thirds in optical fibre).
     • Satellite connections have very long latency because of the distances involved (2 x 36,000 km: up to a geostationary satellite and back down).
     • Latency is important for real-time applications, like videoconferencing or gaming.
     • Use Ping to measure latency on a particular route.
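The distance component of latency can be estimated directly. The sketch below is an illustration with hypothetical names: `velocity_factor` is an assumed parameter giving the signal speed as a fraction of the speed of light (about two-thirds is the commonly quoted figure for optical fibre, and 1.0 suits a radio link to a satellite). The 3,300 km fibre path is a made-up example, roughly the Perth to Sydney straight-line distance; real routes are longer and add switching delay.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum


def one_way_delay_ms(distance_km, velocity_factor=2 / 3):
    """Propagation delay in milliseconds over distance_km.

    velocity_factor is the assumed signal speed as a fraction of the
    speed of light; it ignores switching and queueing delays.
    """
    if distance_km < 0 or not 0 < velocity_factor <= 1:
        raise ValueError("distance must be >= 0 and velocity_factor in (0, 1]")
    return distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor) * 1000


if __name__ == "__main__":
    # Hypothetical 3,300 km fibre path; ping reports the round trip, hence the factor of 2.
    fibre = one_way_delay_ms(3300)
    print(f"Fibre, 3,300 km: {fibre:.1f} ms one way, {2 * fibre:.1f} ms round trip")

    # Geostationary satellite hop: 36,000 km up plus 36,000 km down, at light speed.
    satellite = one_way_delay_ms(2 * 36_000, velocity_factor=1.0)
    print(f"Satellite hop: {satellite:.0f} ms one way, {2 * satellite:.0f} ms round trip")
```

The satellite figure comes out at roughly a quarter of a second each way before any equipment delay is added, which is why such links feel slow for interactive use.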
  20. Elephants (image: CC-BY 2.0 Brian Ralphs)
  21. Elephant Flows. Spot the large blocks of traffic (elephants) moving in and out of the network.
  22. Network Architecture - Current State. [Diagram: the AARNet NREN (100Gbps) links Uni A, Uni B and international connections; on the university campus, all campus data (email, finance, student, research, etc.) passes through the FIREWALL at 1-10Gbps.]
  23. Science DMZ. A network connection tuned for large scientific/research data traffic, which is irregular but very disruptive.
     • Optimised for big data science and elephant flows.
     • A Science DMZ (“demilitarised zone”) diverts large data flows from/to specific sites (e.g. the 10TB of climate data that NCI imports from the USA every week).
     • Improves overall network performance for regular users as well as researchers.
     • Reduces the need to upgrade corporate firewalls.
     • Developed by ESnet in the USA.
  24. Network Architecture - Future State (Science DMZ). [Diagram: the AARNet NREN links Uni A, Uni B and international connections; on the university campus, campus data (email, finance, student, etc.) still passes through the FIREWALL, while campus RESEARCH DATA goes via a Science DMZ switch.]
  25. Talking Points ● Advanced research networks ● Network services ● Researcher data movement problems
  26. Network Services ● CloudStor ● FileSender ● Zoom ● Zoom Webinars ● Discipline-specific Virtual Labs ● eduroam WiFi international roaming ● etc
  27. 27. 27© AARNet Pty Ltd |
  28. 28. 28© AARNet Pty Ltd |
  29. CloudStor: manual and automated data workflows that support data-intensive research
     ● CloudStor Functions: Store, Upload, Share, Send, Sync, Secure, Package, Describe
     ● CloudStor with other Applications: WebDAV (e.g. Cyberduck/Transmit), FileSender API, S3 gateways (e.g. AWS/Azure), Rocket, Jupyter Notebook, Kaltura, Storage services
     ● CloudStor in Research: Move data, Change data, Store data, Describe data, Share data
  30. CloudStor: Upload. When? All through the research lifecycle. How? Browser, Sync, WebDAV, Rocket, FileSender API, S3. What else? File number, sizes and types, equipment, and programming.
  31. Syncing data
     ● A gentle reminder about syncing data: think for a moment before you delete.
     ● Sometimes storage points for data capture are also data clearing points.
     ● Think about developing a workflow to copy files to another folder if data is being moved from the point of capture to the point of processing.
     ● On the bright side, if you accidentally delete your data, it stays in a deleted folder for 30 days.
  32. CloudStor: Share & Send. When? All through the research lifecycle. How? Group Allocation, FileSender. What else? Institutional allocation, vouchers, and notifications.
  33. Sending or Sharing data?
     ● Send lasts for 2 weeks, has file encryption, and can be two-way (vouchers).
     ● Share includes access control (CRUD) and utilises web links.
  34. CloudStor: Secure. When? All through the research lifecycle. How? Group Drive, FileSender, Backup. What else? User access, public/private links and controls, and encryption.
  35. Securing data
     ● Files need to reside on CloudStor for 24 hours to enter the backup cycle.
     ● Data is encrypted at rest and in transit (both automatic).
     ● FileSender supports end-to-end file encryption (in beta).
  36. Talking Points ● Advanced research networks ● Network services ● Researcher data movement problems
  37. Researcher data movement problems
     ● File too big to attach to an email. Solution: send using FileSender.
     ● Slow data transfers via desktop/laptop. Solution: identify where the issue is (Ping, Traceroute) and set up direct data transfers via a tailored network (e.g. Science DMZ).
     ● Putting sensitive data at risk by sharing via email or Dropbox. Solution: send sensitive data encrypted over the network (e.g. FileSender).
     ● Too many files or chunky files, 100TB (e.g. 25,000 characterisation images or 333 videos). Solution: use the FileSender group file transfer facility, or use the FileSender API to connect directly with the application generating/capturing the files.
     ● etc…
  38. THANK YOU. Alex Reid, eResearch Advisor, AARNet. alex.reid@aarnet.edu.au support@aarnet.edu.au
