ICT Infrastructure in Support of Data Sharing


How can NRENs better support data sharing?

Published in: Data & Analytics

  1. 1. The African Open Science Platform ICT Infrastructure in Support of Data Sharing Presented by Ina Smith Project Manager African Open Science Platform Academy of Science of South Africa (ASSAf) WACREN 2018 Conference, 15 March 2018
  2. 2. Data Driven World
  3. 3. Square Kilometre Array (SKA) • Data collection on a massive scale • Telescope array to consist of 250,000 radio antennas between Australia & SA • Investment in machine learning and artificial intelligence software tools to enable data analysis • 400+ engineers and technicians in infrastructure, fibre optics, data collection • Supercomputers to process data (IBM) • To come: a supercomputer with 3x the power of the world’s current fastest computer (Tianhe-2) to cope with SKA data
  4. 4. “Construction of the SKA is due to begin in 2018 and finish sometime in the middle of the next decade. Data acquisition will begin in 2020, requiring a level of processing power and data management know-how that outstretches current capabilities. Astronomers estimate that the project will generate 35,000-DVDs-worth of data every second. This is equivalent to ‘the whole world wide web every day,’ said Fanaroff.”
  5. 5. H3ABioNet (H3Africa) 30 institutions, 15 African countries, 2 partners outside Africa
  6. 6. • African genomic research; Central node at University of Cape Town • Using NetMap to monitor connectivity • Data transfer: Africa Globus Online (668,622 files transferred between Rhodes University & UCT; 140 TB of data transferred from USA to SA) • Challenges: slow & unstable Internet, unreliable power supply, continent-wide obsolete computer infrastructure that ranges from medium-scale server infrastructure to a small number of workstations, with multiple operating systems, lack of centralized, secure data storage • Other: database of participants (H3APRDB, REDCap), data analysis incl. Galaxy, Job Management System, eBiokits, WebProtege, Pipelines for data execution, data repository (European Genome-Phenome Archive)
  7. 7. Open Science Defined “Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.” - FOSTER Project, funded by the European Commission
  8. 8. Benefits of open data • Provide evidence for research conducted • Collaboration advances science, discovery • Predict trends & inform decisions • Drive development, service delivery • More entrepreneurs – using data in innovative ways, create jobs • Potentially far more outcomes and higher impact when open • Democratising research & data towards achieving 2030 Sustainable Development Goals
  9. 9. Open Data, Open Science & Research Lifecycle (Foster)
  10. 10. Research Data Lifecycle (original image from University of California, Santa Cruz) • Figure labels: Plan, Tools, Repositories, Gold/Green OA, Policy & Infrastructure
  11. 11. “Several open science activities are underway across Africa, but a great deal will be gained if, in the context of developing inter-regional links, these activities were to be coordinated and developed through such a coordinating initiative.” - CODATA
  12. 12. Open Data Repositories (re3data - 16)
  15. 15. African Open Science Platform • Platform = opportunity to engage in dialogue, create awareness, connect all, provide continental view • Funded by SA Dept. of Science & Technology through National Research Foundation • 3 years (1 Nov. 2016 – 31 Oct. 2019) • Managed by Academy of Science of South Africa (ASSAf) • Through ASSAf hosting ICSU Regional Office for Africa (ICSU ROA) • Direction from CODATA
  16. 16. Accord on Open Data in a Big Data World • Proposes comprehensive set of principles • FAIR Principles • Values of open data in emerging scientific culture of big data • Need for an international framework • Provides framework & plan for African data science capacity mobilization initiative • Call to Endorse
  17. 17. Key Stakeholders • Global Network of Science Academies (IAP) • International Council for Science (ICSU) • The World Academy of Sciences (TWAS) • Research Data Alliance (RDA) • NRENs (Internet Service Providers for Education) • Association of African Universities (AAU) • Network of African Science Academies (NASAC) • African Research Councils (incl. DIRISA, funders) • African Universities • African Governments • Other
  18. 18. Database Experts & Initiatives 800+
  19. 19. Landscape Survey: Countries & Initiatives 567
  20. 20. Concentration of Activities
  21. 21. Click to view Initiatives/Country Please note: this is only a preview; the data are still to be cleaned, updated and corrected.
  22. 22. AOSP Focus Areas Policy Infrastructure Capacity Building Incentives
  23. 23. Infrastructure Framework • Purpose: Create awareness & guide development of a cyber-infrastructure strategy & action plan, promote policies & strategies • NRENs – Level 6: Elaborated Service Offering (An NREN Capability Maturity Model, Duncan Greaves, 2015, Tertiary Education Network) • Richly connected at high speed to many other networks/resources • Deep culture of collaboration
  24. 24. Proposed NREN Service Catalogue in support of Data • Grid & cloud computing resources/middleware – access: • Scientific applications, complex data sets, computing facilities • User controlled light paths, videoconferencing, federated identity services, security, data storage and archives, connecting e-resources e.g. electron & astronomical microscopes, medical imaging, simulators, sensor networks, accelerators, supercomputers, state-of-the-art affordable bandwidth on demand, computing power, capacity building, dedicated point-to-point Internet Protocol circuits, data storage (data centres)
  25. 25. • Disciplines: Engineering, IT, Economics, Physics, Biology, Environmental Studies, Public Health, Town Planning (Smart Cities), Population Studies • Research Areas: Climate change, environmental impact, extreme weather events, biodiversity, food security, malaria, infectious diseases and pandemics
  26. 26. Data in Africa • Tunisian Computing Centre el Khawarizmi manages Data Centre • Kenya Education Network (KENET) provides access to domain names, data centre, cloud computing & science gateways, capacity building, security services • Data Intensive Research Initiative for South Africa (DIRISA) – component of SA National Cyber-Infrastructure System • Open Data for Africa platform (African Development Bank (AfDB)) – to boost access to quality data for managing & monitoring development results in African countries, incl. African Action Plan 2063 & 2030 SDGs
  27. 27. • High Performance Computing (HPC): Botswana, Lesotho, Mozambique, SA, Tanzania, Zambia, Zimbabwe • South Africa: Data Intensive Research Cloud Infrastructure Initiatives – ARC, SADIRC, Ilifu (cloud for researchers working in astronomy and bioinformatics in Western Cape & research data management system)
  28. 28. Africa Data Consensus Study • Adopted in March 2015 at High Level Conference on Data Revolution • Strategy for implementing data revolution in Africa • Plan of action to be guided by United Nations Economic Commission for Africa (UNECA), African Union Commission (AUC), African Development Bank (AfDB), supported by UN Development Programme (UNDP), UN Population Fund (UNFPA) • Implemented in collaboration with partner institutions from public & private sectors, civil society organisations
  29. 29. SADC Cyber-Infrastructure Framework • Towards strategy and action plan, implementation plan and governance structure • Support strategic plans on Science, Technology, Innovation • Guide on creating an enabling environment to harness science, technology and innovation • Impact socio-economic development & industrialization • Enhance education in developing & using technologies • Support collaborative research development & innovation
  30. 30. • Cyber-infrastructure is a key driver for a knowledge-based economy • Comprises technologies, skills, people and policies which support generation, analysis, transport, sharing, stewardship of information (incl. data) • Framework provides Roadmap towards Cyber-infrastructure Strategy
  31. 31. Services offered by UbuntuNet NRENs [Source: Colin Wright SADC/ET-ST1/1/2016/11 Document]
  32. 32. Components • Research and Education Networks (RENs) • Computation resources & services (HPC etc.) • Data – tools & facilities to enable efficient data-driven discoveries, technologies, innovations • HR capacity development to enable: • CI specialists to roll out services & infrastructure • Beneficiaries to fully benefit from CI services • Policies to enable optimum establishment & utilization of CI
  33. 33. Closing Remarks • Exploit data for the benefit of society (Min Naledi Pandor) • Collaboration in research is key, based on reliable infrastructure & high-speed connectivity • Increasing need for data gathering, transmission, analysis on a massive scale • Infrastructure Frameworks to be adopted, developed in support of data sharing, research collaboration • NRENs are key stakeholders in making collaboration and data sharing possible • Build capacity within NRENs
  34. 34. Thank you! Ina Smith
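
As a rough check on the data volumes quoted in slides 4 and 6, the short Python sketch below turns "35,000 DVDs every second" into a data rate and estimates how long the 140 TB USA-to-SA transfer would take at a few link speeds. The DVD capacity (4.7 GB, single-layer), the decimal units, the link speeds, and the function names are all assumptions made for illustration; none of them come from the slides, and real transfers would be slower because links are shared and unstable.

```python
# Back-of-envelope figures for slides 4 and 6.
# Assumptions (not from the slides): a DVD holds 4.7 GB (single-layer),
# decimal units (1 TB = 10**12 bytes), and the example link speeds below.

DVD_BYTES = 4.7e9          # assumed single-layer DVD capacity in bytes
SECONDS_PER_DAY = 86_400


def ska_bytes_per_second(dvds_per_second: int = 35_000) -> float:
    """Data rate implied by '35,000 DVDs every second' (slide 4)."""
    return dvds_per_second * DVD_BYTES


def transfer_days(total_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move total_bytes over a fully used, loss-free link."""
    seconds = total_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = ska_bytes_per_second()
    print(f"SKA estimate: {rate / 1e12:.1f} TB per second")

    h3a_bytes = 140e12  # the 140 TB USA-to-SA transfer from slide 6
    # Hypothetical link speeds, chosen only to show the spread.
    for label, bps in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
        days = transfer_days(h3a_bytes, bps)
        print(f"140 TB over {label}: {days:.1f} days")
```

Under these assumptions the SKA figure works out to about 164.5 TB per second, and 140 TB needs roughly 13 days on a dedicated 1 Gbit/s link, which illustrates why the slides stress high-speed, reliable NREN connectivity.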