Data management: expose, preserve, protect


Presentation to a miniworkshop by Ian Moore at the ILRI Annual Program Meeting, Addis Ababa, 16 April 2010

Published in: Technology
  • Methodology for the effective management of information and data throughout its useful life
  • Quality - Explicit data validation procedures are used and documented throughout the research process. Efficiency - Data management procedures are designed to improve the efficiency of the research process. Exposure - Research data are in the public domain unless restrictions will serve our mission better (as per IP policy). The key concern here is balancing the rights of the data originator (i.e. getting credit for what he/she has done) with the duty of the institution (i.e. getting the most value from the data, no matter who collected them). As a general rule we allow researchers exclusive use of their data for up to 18 months. Preservation - Data generated from research activities are secured in long-term archives. It is possible to find what data have been generated and archived. Potential users are able to retrieve data in a usable format, subject to restrictions in Principle 3. Understandability - Sufficient information is linked to the data for potential users to decide if they meet their requirements. Ethics - Confidentiality of data on human subjects is respected in accordance with national data protection acts (where they exist) and well-established international standards.
  • The Triple-A Framework developed by the ICT-KM Program seeks to help CGIAR Centers/Programs and their scientists decide on the level of Availability, Accessibility and Applicability (AAA) they want for their research outputs, and also the pathways with which to turn these outputs into International Public Goods.
  • Isabelle to talk about folder structure on Central Storage. Bruno to talk about SLP, Google Apps and CGXchange.
  • Ask the audience if they are ready for a disaster happening in their Centers – a disaster that would prevent them from using their critical IT systems support or even operating in their current location. It is a question that should guide every participant during the BCP (business continuity planning) workshop. The end objective of the workshop is to increase awareness of the need for BCP among CGIAR key management staff.
  • Level 1: a relatively small disaster that will not involve a change in location but will require data or resources to be restored. This type of disaster is usually covered by ICT Services through the incident management procedures. Level 2: a more substantial disaster that affects most of the services hosted in the server room. A decision will need to be taken whether to set up a temporary server room on campus or at the disaster recovery site at ICRAF. Level 3: a significant disaster that prevents services being provided from the main campus. This will require temporary services to be restored at the disaster recovery site at ICRAF. Level 4: a major disaster that will involve relocation of the Centre and its staff to an alternative location in a different country.
  • Strategy 1 – Low criticality to the operation of the Centre. Original data is stored on individual computers. A copy of the data is taken at scheduled times to local online storage. This protects against risk 1. Strategy 2 – Medium criticality to the operation of the Centre. Original data is stored on individual computers, on servers or online locally and is usually owned by individuals, shared by groups, or classed as Institutional information or application data. A copy of the data is replicated to the online remote storage. This protects against risks 1 and 2. Strategy 3 – Medium criticality to the operation of the Centre. The original data is archived data that does not change and is stored online locally. A copy of the data can be made to transportable media and stored in a 3rd location. This protects against risks 1, 2 and 3. Strategy 4 – High criticality to the operation of the Centre. The original data is stored on servers or online locally and is usually owned by individuals, shared by groups, or classed as Institutional information or application data. One or more copies of the data are made online to the remote location. Copies of the data are also made to transportable media and stored in a 3rd location. The frequency of the copies and the length of time they are kept will be negotiated with the owner of the data. This protects against risks 1, 2, 3 and 4.
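
The relationship between the four backup strategies and the four disaster levels in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `BackupStrategy`, its field names, and the helper functions are hypothetical and not part of any ILRI or ICRAF system; only the criticality labels, copy locations and risk coverage are taken from the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupStrategy:
    """One backup strategy from the notes (field names are hypothetical)."""
    number: int
    criticality: str          # "low", "medium" or "high"
    copy_locations: tuple     # where the backup copies are kept
    risks_covered: frozenset  # disaster levels (1-4) the strategy protects against


# Values transcribed from the speaker notes.
STRATEGIES = (
    BackupStrategy(1, "low", ("local online storage",), frozenset({1})),
    BackupStrategy(2, "medium", ("remote online storage",), frozenset({1, 2})),
    BackupStrategy(3, "medium", ("transportable media in a 3rd location",),
                   frozenset({1, 2, 3})),
    BackupStrategy(4, "high",
                   ("remote online storage",
                    "transportable media in a 3rd location"),
                   frozenset({1, 2, 3, 4})),
)


def strategies_covering(risk_level):
    """Return the numbers of all strategies that protect against a level."""
    if risk_level not in range(1, 5):
        raise ValueError("risk level must be between 1 and 4")
    return [s.number for s in STRATEGIES if risk_level in s.risks_covered]


def lightest_strategy(risk_level):
    """Return the lowest-numbered strategy that still covers the level."""
    return min(strategies_covering(risk_level))


print(strategies_covering(3))  # [3, 4]
print(lightest_strategy(2))    # 2
```

Keeping the coverage as explicit sets, rather than assuming "strategy N covers levels 1 to N", makes it easy to adjust a single strategy if the policy changes.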

    1. Data Management: Expose, Preserve, Protect
        • Ian Moore, Head, ICT, ILRI & World Agroforestry
        • Mini-workshop, ILRI APM 2010
    2. Objective of the Mini-Workshop
        • Raise Awareness
            • What’s being done?
            • Who can help?
        • Capture Ideas
            • What’s important?
            • How can we do more?
    3. Agenda for the Mini-Workshop
        • Principles
        • Managing data – What scientists are doing
        • Exposure – How KMIS can Help
        • Preservation – How RMG can Help
        • Protection – How ICT can Help
    4. Information Lifecycle
        • Create / Capture / Receipt
        • Analyse / Manage / Share / Use
        • Retain / Dispose / Preserve
    5. Research Data Management and Archiving Policy
        • Principles:
            • Quality
            • Efficiency
            • Understandability
            • Exposure
            • Preservation
            • Ethics
    6. CGIAR Research Outputs
        • AAA Framework:
            • Availability
            • Accessibility
            • Applicability
    7. Managing Data – What are scientists doing?
    8. Exposure – How KMIS can Help
    9. Preservation – How RMG can Help
    10. Data Access Livestock: the good, the bad and the …
    11. What is data access?
        • Data access typically refers to software and activities related to storing, retrieving, or acting on data housed in a database or other repository.
        • Data access is done in two stages:
            • Physical data access: location, access rights
            • Logical data access: the ability to understand the data, which is facilitated by the metadata
    12. What is metadata?
        • Data about data…
        • A structured description of characteristics such as the meaning, content, structure & purpose of a resource
        • A gateway to resource discovery by enabling field-based searches
        • Where is metadata found?
            • Within the coding for a resource itself
            • In a database of descriptions/repository of resources
    13. Examples of International Metadata Standards
        • Dublin Core
        • Data Documentation Initiative (DDI)
        • IEEE Draft Standard for Learning Object Metadata
        • SCORM (Sharable Content Object Reference Model)
        • Other bodies involved: Prometeus & CEN/ISSS, ISO, BSI
    14. Importance of metadata
        • Provides a standardized system to classify and label web content
        • Improves search relevancy
        • Provides an audit trail
        • Identification of redundant, duplicative, and obsolete content
        • Tracking of institution-wide assembled information
    15. Metadata benefits – to data users
        • Enables searching, retrieval, and evaluation of data set information both within and outside organizations
        • Finding data: determine which data exist for a specific area of interest
        • Access and Transfer: acquisition, utilization and access rights
    16. Metadata benefits – to organizations
        • Organize and maintain an organization's investment in data
        • Documentation of data processing steps, quality control, definitions, data uses and restrictions, etc.
        • Transcends people and time; offers data permanence and creates institutional memory
        • Saves time, money, frustration
    17. The future …
        • Web-based system – KnowledgeTree
        • KnowledgeTree – a DMS that manages the document lifecycle, promotes collaboration and ensures compliance
        • Why KnowledgeTree?
            • Customizable metadata capture
            • Web-based access
            • Role and group-based permissions
            • RSS feeds and subscription integration
            • Open-source – customizable
            • Workflows integration
            • Notifications and alerts
            • Version control
    18. Folder structure
    19. Metadata capture
        • Customizable metadata
    20. Protection – How ICT can Help
    21. Would your data recover from any one of these?
    22. Levels of Risk
        • Level 1: incident management
            • Equipment failure, data loss or corruption
        • Level 2: local relocation of ICT resources
            • Fire, flood, or similar local disaster
        • Level 3: relocation within country
            • Earthquake, major fire or flood
        • Level 4: relocation to a different country
            • Civil unrest, major natural disaster
    23. Backup Strategies
        • Strategy 1 – Single copy on online storage. Protects against risk level 1.
        • Strategy 2 – Additional copy on offsite storage. Protects against risk levels 1 and 2.
        • Strategy 3 – Additional copy at a 3rd location out-of-country. Protects against risk levels 1, 2 and 3.
        • Strategy 4 – Multiple copies, with frequency and retention period decided by the data owner. Protects against risk levels 1, 2, 3 and 4.
    24. Thank you. More Discussion
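
To make the metadata slides more concrete, here is a minimal Python sketch of a metadata record built on the fifteen Dublin Core element names, together with the kind of field-based search the slides describe. It is an assumption-laden illustration, not how KnowledgeTree or any ILRI repository stores metadata: the functions `make_record` and `search` are hypothetical, the second record is invented purely for the example, and the ISO-style date string is a formatting choice made here.

```python
# The fifteen element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def make_record(**fields):
    """Build a metadata record, rejecting names that are not DC elements."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return dict(fields)


def search(records, field, term):
    """Field-based search: case-insensitive substring match on one element."""
    if field not in DC_ELEMENTS:
        raise ValueError(f"not a Dublin Core element: {field}")
    term = term.lower()
    return [r for r in records if term in str(r.get(field, "")).lower()]


records = [
    # Described from this presentation's own title, author and date.
    make_record(
        title="Data management: expose, preserve, protect",
        creator="Ian Moore",
        date="2010-04-16",
        subject="data management",
    ),
    # Hypothetical record, invented only to give the search something to skip.
    make_record(
        title="Example survey dataset (hypothetical)",
        creator="A. Researcher",
        subject="example data",
    ),
]

hits = search(records, "creator", "moore")
print([r["title"] for r in hits])
# ['Data management: expose, preserve, protect']
```

Restricting records to a fixed element set is what makes field-based searching possible: every record answers the same questions (who created it, when, about what), so a query on one field is meaningful across the whole collection.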