Dr. Luciana Duranti – Extended version of the English-language presentation for the seminar in Moscow, September 23, 2013

Presentation of a paper by Dr. Luciana Duranti on the problems of ensuring the authenticity of electronic records and trust in them, and on InterPARES Trust, the new international project she leads. The presentation was prepared for a seminar organized by the company "Электронные Офисные Системы" (Electronic Office Systems) in Moscow on September 23, 2013. The extended version contains a substantial amount of additional material and is intended to give seminar participants a deeper acquaintance with the issues discussed.

Presentation by Dr. Luciana Duranti (Director, Centre for the International Study of Contemporary Records and Archives, University of British Columbia; Director, InterPARES Trust project) on authenticity and trust in the electronic environment and on the InterPARES Trust project. The presentation was prepared for the seminar in Moscow on September 23, 2013. The extended version contains numerous additional materials that are helpful for a deeper understanding of the issues in question.

  1. 1. InterPARES Trust Prof. Dr. Luciana Duranti Project Director The University of British Columbia Canada
  2. 2. The Overall Challenge • The nature of digital records • Establishing digital records accuracy, reliability and authenticity and maintaining it over time so that it can be proven • Developing an infrastructure that ensures a seamless controlled flow of authentic data/documents/records from the creator to the preserver irrespective of changes in technology • Providing transparency while protecting secrecy where warranted • Ensuring that the conflicting rights of users, clients, employees, and future generations are protected • Ensuring the permanent preservation of the documentary cultural heritage in digital form
  3. 3. Principles to Respect • Technology cannot determine the solution to the reliable and accurate creation of digital records or to their authentic preservation over the long term: the professional’s needs define the problem and archival principles must establish the correctness and adequacy of each technical solution • Solutions to the digital records challenges are inherently dynamic and specific to the cultural, disciplinary, administrative and legal situations • Preservation is a continuous process that begins with records creation
  4. 4. Digital vs. Traditional Records In the digital environment: • Record content, structure and form are no longer inextricably linked • The stored entity is distinct from its manifestation and its digital presentation has to be considered as well as its documentary one • When we save a record, we take it apart into its digital components, and when we retrieve it, we reproduce it (it is not possible to preserve a digital record, only the ability to reproduce or recreate it) Therefore, we can no longer determine authenticity on the object-record, which is composite (stored + manifested) and permanently new (re-production), but must make an inference of authenticity from its environment of creation, maintenance & use and preservation.
  5. 5. Records Online Furthermore, increasingly individuals and organizations choose to keep their records online. The primary uses of the online environment are: • Backup • Collaboration • Distribution • Recordkeeping • Long-term storage • Keeping Archives • Email storage is number one.
  6. 6. Motivations
  7. 7. Internet vs Cloud Often the Internet is referred to as the Cloud. Technically this is a misuse of terms. I will use the term Internet provider to refer to “entities providing users the ability to communicate through a computer system that processes or stores computer data on behalf of such communication or users.” (Budapest Convention on Cybercrime, 2001). Therefore, there are three “actions” related to the definition of provider: communication, data processing and data storage. However, the term Cloud is useful because it conveys the nebulous nature of what happens on the Internet, and the fact that, unlike other industries presenting similar characteristics, such as the aerospace industry, the services offered on the Internet are neither much regulated nor transparent.
  8. 8. Trust on the Internet • In fact we know very little about what happens on the Internet. The standard of trustworthiness for it is that of the ordinary marketplace, caveat emptor, or buyer beware • Trust is defined in legal theory as a relationship of voluntary vulnerability, dependence and reliance, based on risk assessment • The nature of trust relationships on the Internet is fraught with risks, weaknesses, and fault-lines inherent in the management of records and their storage in rapidly changing technologies where authorship, ownership, and jurisdiction may be questioned.
  9. 9. What is involved in Trust? • In business, trust involves confidence of one party in another, based on alignment of value systems with respect to specific benefits • In everyday life, trust involves acting without the knowledge needed to act. It consists of substituting the information that one does not have with other information • Trust is also a matter of perception and it is often rooted in old mechanisms which may lead us to trust untrustworthy entities
  10. 10. Whom Do We Trust? • We trust banks, phone companies, hospitals, government, etc. to keep and maintain digital data, records, archives about us or belonging to us on our behalf. However, where those records actually reside, how well they are being managed, how long they will be available to us...we have no idea! • Nothing wrong with it. After all, we trust airplanes to fly us safely without any need to know the pilot, and we trust banks to manage our money, and hospitals to care for our health. • What would be different in putting trust in the Internet?
  11. 11. Questions We Should Be Asking • How can confidentiality of records and data privacy be protected on the Internet? • How can forensic readiness of an organization be maintained, compliance ensured, and e-discovery requests fully met? • How can an organization’s records accuracy, reliability, and authenticity be guaranteed and verifiable? • How can an organization’s records and information security be enforced? • How can an organization maintain governance upon the records entrusted to the Internet? • How can the preservation of records of permanent value be ensured?
  12. 12. The Classic Response • Choosing the Internet is a Risk Assessment decision where Risk = probability x impact. It is a question of comparison. If one cannot have everything, what does one give up? • The first choice offered us is between Transparency and Security: the Internet offers “trust through technology.” Security involves location independence: a core aspect of Internet services delivery models. • The second choice offered us is between Control and Economy: the Internet offers “trust through control on expenditures.” • But there is a necessary tension between laws that protect records in a traditional way and the abdication of custody and process without responsibility. Many are aware of this tension.
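
The risk formula on this slide (Risk = probability x impact) can be made concrete with a small sketch. The scenario names, probabilities and impact figures below are invented purely for illustration and do not come from the presentation; the point is only that the comparison the slide describes reduces to summing probability-weighted losses for each option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One adverse event with an estimated likelihood and cost (hypothetical values)."""
    name: str
    probability: float  # chance of the event over the assessment period, between 0 and 1
    impact: float       # estimated loss if the event occurs, in arbitrary currency units


def risk(scenario: Scenario) -> float:
    """Risk = probability x impact, as stated on the slide."""
    return scenario.probability * scenario.impact


def total_risk(scenarios: list[Scenario]) -> float:
    """Sum the individual risks of all scenarios attached to one option."""
    return sum(risk(s) for s in scenarios)


# Illustrative figures only: keeping records in-house versus with an online provider.
in_house = [
    Scenario("hardware failure", 0.10, 50_000),
    Scenario("data breach", 0.02, 200_000),
]
online = [
    Scenario("provider outage", 0.05, 30_000),
    Scenario("data breach", 0.04, 200_000),
]

if __name__ == "__main__":
    print(f"in-house: {total_risk(in_house):.0f}, online: {total_risk(online):.0f}")
```

A single number like this hides the qualitative trade-offs the slide goes on to discuss (transparency versus security, control versus economy), so it is at best a starting point for the comparison.
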
  13. 13. Benefits Reduced Costs • No owning of hardware/software, so no huge upfront costs. • Lower energy costs. • Reduced IT personnel costs, as they don’t have to implement or maintain a Record Keeping System. • Even in a private cloud, a shared-tenant system allows pooling of resources to get more for less: better hardware/software and network.
  14. 14. Benefits Scalability • You can get whatever you need, and only pay for what you use. • You can track and measure use.
  15. 15. Benefits Reliability • Always there on demand, big or small. • Available from anywhere, using a browser.
  16. 16. Benefits Security • Security can be more robust than any one organization or unit could afford otherwise, both physical and virtual. • Data sharding and data obfuscation require a critical mass of data and complex technologies. • Centralized control of data is easier to secure.
  17. 17. Benefits Collaboration • Allows for easy collaboration, as all files are in a consistent format, viewed in a web browser. • Can access and distribute information across distant geographic areas. • Think Google Docs, Dropbox.
  18. 18. Risks Cost Issues • If you calculate transfer, implementation and subscription, costs are not insignificant. One can get unexpected license fees. • Variability of costs: no set monthly fee. • There is a significant per-request charge, to motivate access in large chunks. • In Amazon, for example, although you are allowed to access 5% of your data each month with no per-byte charge, the details are complex and hard to model, and the cost of going above your allowance is high.
  19. 19. Risks Provider Reliability Issues • Public providers can go bankrupt, disappear or be sold. Your records might be gone. • Public and private providers can lose records, and sometimes can’t get them back or backups fail.
  20. 20. Risks Security Issues • Unauthorized access, subcontractors, hackers. It is not a matter of if but when a breach will occur. Are you told when it does? • Documents can be stored anywhere and can be moved at any time without you knowing. • Encryption might not be done, in transit or in the cloud. A security firm found last month that nearly 16% of the Amazon directories in which business customers store data could be perused by anyone online, revealing thousands of files containing sales records, passwords and personal data. It is a relatively new technology accessible to non-technical users. • Shared servers could intermingle information. • Law enforcement may seize servers for one person’s actions. If 50 persons used the server, it may take them days to get access to their records.
  21. 21. Risks Control • You have no real control over records online. • No control over who shares your servers with you or to whom services are delegated. • Terms of service or privacy policy may change. • Backup may be done without you knowing and may not be disposed of as needed. • Records might be deleted without you knowing or may not be deleted according to the retention schedule.
  22. 22. Risks Control #2 • You do not know what happens when hardware/software become obsolete. • You can’t always move or remove records (e.g. for transfer to archives). • Audit is not allowed. • Termination of contract: records portability and continuity. • Termination of provider: records sustainability.
  23. 23. Risks Transparency • Chain of custody is not demonstrable. • Records reliability cannot be inferred from known processes. • Tampering is possible on the Internet, so records authenticity cannot be inferred. • Records on the Internet cannot have forensic integrity (repeatability, verifiability, objectivity). • Can records then be admissible as evidence in a court of law?
  24. 24. Risks Privacy Risks • The EU Data Protection Directive deals with privacy. It regulates processing of personal data in the EU. One can’t transfer personal information (or its processing) of EU residents to countries that don’t have similar privacy protection (like the US, regardless of the Safe Harbor clause). • The EU is developing a right-to-be-forgotten directive. Can le droit à l'oubli be protected?
  25. 25. Risks Legal Risks • Geographic location of information: jurisdiction issues (loss of location). • Trade secrets: are they still secret when shared with a provider? • Legal privilege: is it still applicable if a provider can access the records? • US Patriot Act: the FBI can get court orders under Section 215. • Can you isolate documents for legal hold? • If multiple copies exist in different locations, which is the authoritative one? • How can its authority be certified?
  26. 26. Risks Legal Risks: Metadata • How does metadata follow or trace records in the cloud? • How is this metadata migrated as a recordkeeping activity over time? • Who owns the metadata, especially metadata created by the service providers related to their management of your records and data? • Is metadata intellectual property? Whose? • How can this metadata be accessed for court and what are the responsibilities of the provider in cases of legal discovery or hold?
  27. 27. The Trust Challenge If we decide to carry out our activities online, we must find a balance between trust and trustworthiness, which is needed to ensure a balanced trust relationship. Trust constitutes a risk which can only be mitigated by the establishment of a trust balance: we must trust trustworthy trustees and trustworthy records.
  28. 28. Trust & Its Rules Trust involves acting without the knowledge needed to act. It consists of substituting the information that one does not have with other information. The rules of trust refer to those who give trust as well as to those who receive trust: trusters [givers] and trustees [receivers] The trust-bond between trusters and trustees is usually based on four characteristics of the trustees
  29. 29. Characteristics of Trustees • reputation, which results from an evaluation of the trustee’s past actions and conduct; • performance, which is the relationship between the trustee’s present actions and the conduct required to fulfill his or her current responsibilities as specified by the truster; • confidence, which is an assurance of expectation of action and conduct the truster has in the trustee; and • competence, which consists of having the knowledge, skills, talents, and traits required to be able to perform a task to any given standard
  30. 30. Trust & Authenticity • In the digital environment authenticity is an inference based on foundation evidence and, in some measure, on confidence in the performance and competence of the keeper of the material, based on its reputation. • The level of trust required is proportional to the sensitivity of the material to be trusted as authentic and the adverse consequences of its lack or loss of trustworthiness. • To guarantee the authenticity of digital records requires intentional action or intervention by trusted entities imbued with accountability, but also an adequate framework of policies, procedures, and technologies. This has always been the case.
  31. 31. Trustworthy Record: More Than An Authentic Record Reliability: The trustworthiness of a record as a statement of fact, based on the competence of its author, its completeness, and the controls on its creation Accuracy: The correctness and precision of a record’s content, based on the above, and on the controls on content recording and transmission Authenticity: The trustworthiness of a record that is what it purports to be, untampered with and uncorrupted, based on its identity and integrity, and on the reliability of the records system in which it resides
  32. 32. Reliability Reliability: the source of the record is the key, defined in a way that points primarily to a reliable person and procedure (for computer stored documents) or a reliable process and software (for computer generated documents), or both. The software should be open source, because the processes of records creation and maintenance can be authenticated either • by describing the process or system used to produce a result or • by showing that the process or system produces an accurate result
  33. 33. Accuracy Digital entities are guaranteed accurate if they are repeatable. Repeatability, which is one of the fundamental precepts of digital forensics, is supported by the documentation of each and every action carried out on the record. Open source software is again the best choice for assessing accuracy, especially when conversion or migration occurs, because it allows for a practical demonstration that nothing could be altered, lost, planted, or destroyed in the process
  34. 34. Authenticity Context: The procedural, documentary and technological environment in which the record was created and used over time Identity: The whole of the attributes of a record that characterize it as unique, and that distinguish it from other records (e.g. date, author, addressee, subject, identifier). Integrity: A record has integrity if the message it is meant to communicate in order to achieve its purpose is unaltered (e.g. text and form fidelity, absence of technical changes).
  35. 35. Integrity The quality of being complete and unaltered in all essential respects. We were never fussy about it. What if a letter had holes, or was burned on the side or the ink passed through? The same definition used with respect to data, documents, records, copies, records systems As long as it was good enough...but how good is good enough in the digital environment?
  36. 36. Data Integrity Based on Bitwise Integrity: the fact that data are not modified either intentionally or accidentally “without proper authorization.” • The original bits are in a complete and unaltered state from the time of capture, that is, they have the exact and same order and value • Small change in a bit means a very different value presented on the screen or action taken in a program or database.
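
The claim that a small change in a bit produces a very different value can be checked directly. The sketch below is an illustration rather than anything taken from the presentation: the helper `flip_bit` and the sample values are hypothetical, and the bit numbering (bit 0 is the least significant bit of the first byte) is an arbitrary choice.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted.

    Bit 0 is the least significant bit of the first byte (an arbitrary convention).
    """
    byte_index, offset = divmod(bit_index, 8)
    altered = bytearray(data)
    altered[byte_index] ^= 1 << offset
    return bytes(altered)


# A number stored in two bytes: inverting the top bit of the first byte
# turns 1000 into 33768.
number = (1000).to_bytes(2, "big")
altered_number = int.from_bytes(flip_bit(number, 7), "big")

# Text on screen: inverting one bit of the character "1" (0x31) yields "9" (0x39).
message = b"Pay 100"
altered_message = flip_bit(message, 4 * 8 + 3)

if __name__ == "__main__":
    print(altered_number)
    print(altered_message.decode("ascii"))
```

In both cases the altered data has the same length as the original, which is one reason file size alone says nothing about integrity.
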
  37. 37. Loss of Fidelity: Analog vs. Digital
  38. 38. Loss of Fidelity (cont.) • If Original Bits 101 • Change state to 110 • Continues to a 011 • Same bits, but Different value
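
The three bit patterns on this slide can be evaluated as a quick check, assuming they are read as unsigned binary numbers (the slide does not say how they are interpreted):

```python
patterns = ["101", "110", "011"]

# Interpret each pattern as an unsigned binary number.
values = [int(p, 2) for p in patterns]

# Every pattern is built from the same multiset of bits: one 0 and two 1s.
same_bits = all(sorted(p) == sorted(patterns[0]) for p in patterns)

if __name__ == "__main__":
    for pattern, value in zip(patterns, values):
        print(pattern, "->", value)
    print("same bits in every pattern:", same_bits)
```

The bits are the same in each case and only their order differs, yet the values are 5, 6 and 3, which is why bitwise integrity requires both the order and the value of the bits to be preserved.
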
  39. 39. Protecting Records From Data Alteration • Intentional alteration preventable through permission and access controls • Accidental alteration avoidance requires that additional hardware and/or software be in place • Requires method of determining if the record has been altered, maliciously or otherwise • Cannot rely on file size, dates or other file properties • We need audit logs and strong methods like Checksum and HASH Algorithms
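
As a minimal sketch of the last two points, the example below uses SHA-256 from Python's standard `hashlib` module to detect an alteration that file size would miss. The sample record contents and the helper names `sha256_hex` and `is_unaltered` are assumptions made for illustration; the slide does not name a specific algorithm.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_digest: str) -> bool:
    """Compare the current digest of `data` with one recorded earlier."""
    return sha256_hex(data) == expected_digest


# Hypothetical record content; the digest is recorded at capture time.
original = b"Invoice 2013-09-23: pay 100 EUR"
baseline = sha256_hex(original)

# A later alteration that leaves the size of the record unchanged.
tampered = b"Invoice 2013-09-23: pay 900 EUR"

if __name__ == "__main__":
    print("same size:", len(original) == len(tampered))
    print("original verifies:", is_unaltered(original, baseline))
    print("tampered verifies:", is_unaltered(tampered, baseline))
```

A digest only helps if the baseline value itself is stored where the person altering the record cannot also replace it, which is where the audit logs mentioned on the slide come in.
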
  40. 40. Duplication Integrity The fact that, given a data set, the process of creating a duplicate of the data does not modify the data, and the duplicate is an exact bit copy of the original data set. Time stamps are useful to support it. Disk Image: a bit by bit reproduction of the storage medium. A full disk copy of the data on a storage device Different from a copy: a selective duplicate of files – You can only copy what you can see – Rarely includes confirmation of completeness – Moved as individual files – Provides incomplete picture of the digital device Issues linked to images: deleted files? Is the image a record?
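
Duplication integrity can be checked by hashing the source before and after copying and comparing both with the hash of the duplicate. The sketch below does this for a single file in a temporary directory; it is a file-level illustration only, not a disk-imaging tool, and the function names and file names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_and_verify(source: Path, target: Path) -> bool:
    """Copy `source` to `target` and confirm that neither side was modified.

    Returns True only if the source is unchanged by the process and the
    duplicate is bit-for-bit identical to it.
    """
    before = file_digest(source)
    shutil.copyfile(source, target)
    return file_digest(source) == before == file_digest(target)


def demo() -> bool:
    """Create a small sample file, duplicate it, and verify the duplicate."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.bin"
        source.write_bytes(bytes(range(256)) * 64)
        return duplicate_and_verify(source, Path(tmp) / "duplicate.bin")


if __name__ == "__main__":
    print("duplicate verified:", demo())
```

Copying files this way still has the limitations the slide lists for a copy (only what is visible is copied), so it shows the verification step rather than replacing a full disk image.
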
  41. 41. Computer and System Integrity Computer integrity: the computer process produces accurate results when used and operated properly and it was so employed when the evidence was generated. System Integrity: a system performs its intended functions in an unimpaired manner, free from unauthorized manipulation whether intentional or accidental, and it did so when the evidence was generated and used. Both imply hardware and software integrity
  42. 42. Computer or System Integrity Protected by: • Sufficient security measures to prevent unauthorized or untracked access to the computers, networks, devices, or storage. • Stable physical devices that will maintain their ‘statefulness’ – the value they were given is maintained until authorized to change. – Users/permissions – Passwords – Firewalls – Logs
  43. 43. System Logs and Auditing Sets of files automatically created to track the actions taken, services run, or files accessed or modified, at what time, by whom and from where • Web logs (Client IP Address, Request Date/Time, Page Requested, HTTP Code, Bytes Sent, Browser Type, etc.) • Access logs (User account ID, User IP address, File Descriptor, Actions taken upon record, Unbind record, Closed connection) • Transaction logs (History of actions taken on a system to ensure Atomicity, Consistency, Isolation, Durability; Sequence number; Link to previous log; Transaction ID; Type; Updates, commits, aborts, completes)
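
The web log fields listed above map closely onto the widely used Combined Log Format. The sketch below parses one such line into those fields; the choice of format, the regular expression, and the sample line (which uses a documentation-only IP address and an invented user and page) are assumptions for illustration, since the slide does not specify a log format.

```python
import re
from datetime import datetime

# Pattern for one line in Combined Log Format (assumed format, see lead-in).
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<browser>[^"]*)"'
)


def parse_line(line: str) -> dict:
    """Split one log line into client IP, request time, page, HTTP code, bytes and browser."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    entry = match.groupdict()
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash means that no response body was sent.
    entry["bytes_sent"] = 0 if entry["bytes_sent"] == "-" else int(entry["bytes_sent"])
    return entry


# Invented sample line for illustration.
SAMPLE = (
    '203.0.113.7 - alice [23/Sep/2013:10:15:32 +0400] '
    '"GET /records/42 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'
)

if __name__ == "__main__":
    for field, value in parse_line(SAMPLE).items():
        print(f"{field}: {value}")
```
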
  44. 44. Auditing Logs • Increasingly required by law to demonstrate integrity of the system • Properly configured and restricted, they provide checks and balances • Ability to determine effective security policies • Ability to trap errors that occur • Provide instantaneous notification of events • Monitor many systems and devices through ‘dashboards’ • Allow the accountability of people to be determined • Provide the necessary snapshot for post-event reconstruction (‘black-box’) • Answer Who-What-Where-When, but only if retained for sufficient time (space vs. money vs. risk vs. knowledge)
  45. 45. Assessment of Computer/System Integrity The assessment is based on repeatability, verifiability, objectivity and transparency An inference of computer/system integrity can be made based on the facts that: – the theory, procedure or process on which the design is based has been tested or cannot be tampered with – it has been subjected to peer review or publication (standard) – its known or potential error rate is acceptable – it is generally accepted within the relevant scientific community
  46. 46. Process Integrity Non-interference: the method used to gather, capture, use, manage and preserve digital data or records does not change the digital entities Identifiable interference: the method used does alter the entities, but the changes are identifiable These principles, which embody the ethical and professional stance of records and information managers, archivists, and digital forensics experts, are consistent with the impartial stance of a neutral third party, a trusted custodian
  47. 47. Authentication A means of declaring the authenticity of a record at one particular moment in time -- possibly without regard to other evidence of identity and integrity. Example: the digital signature. Functionally equivalent to seals (not to signatures): verifies record’s origin (identity); certifies record’s intactness (integrity); makes record indisputable and incontestable (non-repudiation). But, seals are associated with a person; digital signatures are associated with a person and a record. They are not a preferred means of authentication through time: they are preferred only across space.
  48. 48. Barriers to adoption of digital signatures • The technology of digital signatures is very complex • Digital signatures are useful for secure transmission of confidential intelligence, private data, financial transactions of relevant sizes, and other similar information, but they are overkill for everything else. • With routine records, the risk of doing damage is higher than the risks from which the digital signature is supposed to protect us • Digital signatures become obsolete very fast • Digital signatures do not allow long term preservation
  49. 49. Preferred Means of Authentication A chain of legitimate custody is grounds for inferring authenticity and authenticating a record. Digital chain of custody: the information preserved about the record and its changes that shows specific data was in a particular state at a given date and time. A declaration made by an expert who bases it on the trustworthiness of the recordkeeping system and of the procedures controlling it (Information governance and quality assurance). However, the knowledge acquired through InterPARES and the Digital Records Forensics Project was not helpful in addressing the issues presented by records in an online environment, although it was vital for identifying the problems.
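
One possible way to record a digital chain of custody as defined on this slide is a log in which each entry stores the state of the record (its hash), the date and time, and the hash of the previous entry. The hash chaining is a technique chosen here for illustration, not a method described in the presentation, and every name in the sketch (`append_event`, `verify_chain`, the actors and actions) is hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """Hash the canonical JSON form of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(chain: list, actor: str, action: str, record: bytes, when: datetime) -> dict:
    """Append a custody event that captures the record's state at a given date and time."""
    body = {
        "sequence": len(chain),
        "timestamp": when.isoformat(),
        "actor": actor,
        "action": action,
        "record_sha256": hashlib.sha256(record).hexdigest(),
        "previous_hash": chain[-1]["entry_hash"] if chain else GENESIS,
    }
    entry = {**body, "entry_hash": _entry_hash(body)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Check that no entry was edited, removed or reordered after it was written."""
    previous = GENESIS
    for position, entry in enumerate(chain):
        body = {key: value for key, value in entry.items() if key != "entry_hash"}
        if entry["sequence"] != position or entry["previous_hash"] != previous:
            return False
        if _entry_hash(body) != entry["entry_hash"]:
            return False
        previous = entry["entry_hash"]
    return True


def demo() -> tuple:
    """Build a two-event chain, verify it, then tamper with it and verify again."""
    chain = []
    record = b"contract v1"
    append_event(chain, "creator", "created", record, datetime(2013, 9, 23, 9, 0, tzinfo=timezone.utc))
    append_event(chain, "archive", "transferred", record, datetime(2013, 9, 24, 9, 0, tzinfo=timezone.utc))
    intact = verify_chain(chain)
    chain[0]["actor"] = "someone else"  # simulate an after-the-fact edit
    return intact, verify_chain(chain)


if __name__ == "__main__":
    print(demo())
```

Such a log only shows that its own entries are internally consistent. Whether the custody it documents was legitimate still rests on the trustworthiness of the recordkeeping system and the procedures controlling it, as the slide says.
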
  50. 50. The Internet Community • The interconnectedness of the Internet is forcing us into one community without the benefit of gradually getting to know one another • As the United States developed the Internet, its social, political, economic views are reflected in its management, thereby rankling other countries • The Internet is evolving into more of an international entity, but what does that mean in terms of policies and practices regarding the handling of digital records residing with Internet services and social media providers? • This question, together with those identified earlier, is among those addressed by a new phase of the InterPARES research project
  51. 51. The Goal of InterPARES 1 and 2 (1999-2007) To develop the body of theory and methods necessary to ensure that digital records produced in databases and office systems as well as in dynamic, experiential and interactive systems in the course of artistic, scientific and e-government activities can be created in accurate and reliable form and maintained and preserved in authentic form, both in the long and the short term, for the use of those who created them and of society at large, regardless of technology obsolescence and media fragility.
  52. 52. Goal of InterPARES 3 (2007-2012) To enable public and private archival organizations and programs with limited resources to preserve over the long term authentic records that satisfy the requirements of their stakeholders and society’s needs for an adequate record of its past. It did so by building on the products of the first two phases of InterPARES (1998-2006)
  53. 53. Key IP 1 & 2 Products Policy Framework A framework of principles guiding the development of policies for records creating and preserving organizations
  54. 54. IP 1 & 2 Products Creator Guidelines Recommendations for making and maintaining digital materials for individuals and small communities of practice
  55. 55. IP 1 & 2 Products Preserver Guidelines Recommendations for digital preservation for archival institutions
  56. 56. IP 1 & 2 Products Benchmark and Baseline Requirements Authenticity requirements for assessing and maintaining the authenticity of digital records
  57. 57. IP 1 & 2 Products File Format Selection Guidelines Principles and criteria for adoption of file formats, wrappers and encoding schemes
  58. 58. IP 1 & 2 Products Terminology Database Including a glossary, a dictionary and ontologies
  59. 59. IP 1 & 2 Products Two Records Management Models Chain of Preservation (COP) Model (lifecycle) Business-driven Recordkeeping (BDR) Model (continuum)
  60. 60. IP 1 & 2 Final Products Two books: Luciana Duranti, ed. The Long-term Preservation of Authentic Electronic Records: Findings of the InterPARES Project (San Miniato: Archilab, 2005). Available on line at Luciana Duranti and Randy Preston, eds. InterPARES 2: Interactive, Dynamic and Experiential Records (Roma: ANAI, 2008). Available on line at
  61. 61. InterPARES 3 General Studies • Web 2.0/Social Media • Terminology Database • Analysis of Other Digital Preservation Projects • International Standards Relevant to IP3 • Annotated Bibliography Database • E-mail Preservation • Preservation of Registries
  62. 62. InterPARES 3 General Studies • • • • • • • • • • • National Standards Relevant to IP3 Community Archives e-Records Assessment Public Sector Audit Report for Digital Recordkeeping Records Management Policies and Procedures Template Cost-benefit Models Ethical Models File Viewers Education Modules Open Source Records Management Software Metadata Applications Profiles Organizational Culture & Risk Assessment
  63. 63. InterPARES Impact • Legislation: Italy, China • Standards: DOD 5015.2 (2007), MoReq 2 (2008), OAIS (2009) • Policies and Procedures: all participating countries in both public and private sectors • ICA Education Modules and Multilingual Archival Database
  64. 64. InterPARES Products All InterPARES Products are available at
  65. 65. A Community of Trust • People trust banks, phone companies, hospitals, government, etc. to keep and maintain their digital data/records/archives on their behalf. Where their records actually reside, how well they are being managed, how long they will be available to them... they have no idea! • Organizations are becoming concerned about a liability they may not have thought they were assuming. • Others are amassing huge volumes of data that they use to provide a host of services, many of which focus on marketing and securing competitive advantage: big data. • Big data also fosters a range of democratic objectives, from promoting government transparency to supporting research to contributing to public-private sector goals and priorities.
  66. 66. Trust on the Internet • Trust is a relationship of voluntary vulnerability, dependence and reliance based on risk assessment • The nature of trust relationships on the Internet is fraught with risks, weaknesses, and fault-lines inherent in the management of records in rapidly changing technologies where authorship, ownership, and jurisdiction may be questioned • In fact we know very little about what happens on the Internet, which is neither regulated nor transparent. The standard of trustworthiness for it is that of the ordinary marketplace, caveat emptor, or buyer beware.
  67. 67. What is involved in Trust? • In business, trust involves confidence of one party in another, based on alignment of value systems with respect to specific benefits • As said earlier, in everyday life, trust involves acting without the knowledge needed to act. It consists of substituting the information that one does not have with other information • Trust is also a matter of perception and it is often rooted in old mechanisms which may lead us to trust untrustworthy entities
  68. 68. The Trust Challenge If we decide to carry out our activities online, we must find a balance between trust and trustworthiness, which is needed to ensure a balanced trust relationship. Trust constitutes a risk which can only be mitigated by the establishment of a trust balance: we must trust trustworthy trustees and trustworthy records.
  69. 69. InterPARES Trust (2013-2018) The goal of InterPARES Trust is to generate the theoretical and methodological frameworks that will support the development of integrated and consistent local, national and international networks of policies, procedures, regulations, standards and legislation concerning digital records entrusted to the Internet, to ensure public trust grounded on evidence of good governance, a strong digital economy, and a persistent digital memory. InterPARES Trust is funded by a 5-year SSHRC Partnership grant and matching funds from UBC and all the partners (in cash and/or in kind)
  70. 70. InterPARES Trust Participants • The International Alliance comprises 7 Teams: North America, South America, Europe, Asia, Australasia, Africa, Transnational Organizations • Supporting Partners • Pro-bono Consultants • International Alliance Steering Committee • Project Coordinator • Project Administrator • Project Technology Expert • Student Research Assistants Total: 195+ members and growing
  71. 71. Research Questions • How can confidentiality of organizational records and data privacy be protected? • How can forensic readiness of an organization be maintained, compliance ensured, and e-discovery requests fully met? • How can an organization’s records accuracy, reliability, and authenticity be guaranteed and verifiable? • How can an organization’s records and information security be enforced? • How can an organization maintain governance upon the records entrusted to the Internet?
  72. 72. Research Objectives • Building the foundations for establishing a relationship of trust between the people and those organizations that hold the records and data related to and/or belonging to them on the Internet • Ensuring the trustworthiness of data and records created in the interaction of people and organizations • Developing a supra-national framework embracing both developed and developing countries and all sectors, which is capable of guiding the development of domestic legislation and regulatory instruments that are consistent across cultures and societies
  73. 73. Theoretical Framework • archival and diplomatics theory, in particular the ideas that are foundational to trusting records • resource-based theory, which focuses on the importance of technical, managerial, and relational capabilities for leveraging resources to maximize competitive advantage • risk management theory on “post-trust societies”, which represents an available body of knowledge for reflection and further investigation on the relationship between risk and trust, and risk management and trust management • design theory, which adopts an argumentative process where an image of the problem and of the solution emerges gradually among the parties, as a product of incessant judgment, subjected to critical argument • human computer interaction, with its knowledge of human cognition, technological capabilities, networking, and human computer engagement • digital records forensics theory • theories of measurement and calculation, and psychology of symbology, presentation and interpretation of trust labels.
  74. 74. Methods In the first 4 years, research data will result from 1. a close analysis of the services offered on the Internet, as well as the technology that supports such services; 2. a study of relevant law and case law, regulations and standards; 3. a combination of surveys and interviews of providers and existing users of Internet services; and 4. case studies and general studies. We will focus on gathering, analyzing and interpreting data from a wide cross-section of organizations and institutions in order to explore the nature of trust relationships on the Internet, and the risks, weaknesses, and fault-lines inherent in records management and storage in rapidly changing technologies where authorship, ownership, and jurisdiction may be questioned.
  75. 75. Methods (cont.) At the conclusion of each study the results may be represented using activity and entity modeling, an analytic tool that enables understanding of the situational realities and work processes before and after modifications have been introduced to address problems. We will use diplomatic and archival analysis, digital records forensic analysis, and textual analysis, as well as visual analytics. We will employ comparative analysis to generate a theory of trust in cloud environments that transcends national and jurisdictional boundaries, and on that basis identify ways of addressing the challenges evidenced by modeling and visualization. After having identified solutions, we will draft model policies, procedures, and processes, and ask the test bed partners to test them.
  76. 76. Working Groups
  77. 77. Infrastructure Domain • Technology/Mechanisms/Services • Issues specific to types of infrastructure • Reliability of infrastructure (e.g. obsolescence, continuing access, sustainability) • Types of contractual agreements • Costs
  78. 78. Infrastructure: Proposed Studies • Contract Terms for Cloud-based Record Keeping Services: Cloud-based services (CBS) and the technological infrastructure (s/w, h/w) are primarily set by the vendors of these types of services and secondarily by the purchaser’s needs/expectations. Terms of contracts for CBS thus represent interests from two perspectives: i) the service provider; ii) the purchaser. Through empirical analysis, the research will categorize: a) terms found in available contracts relating to record keeping requirements in terms of commonality or frequency of appearance; b) types of services purchased; c) types of technological infrastructure. It will determine, to the degree possible, whether the terms represent primarily the interests of the service provider or the purchaser. It will relate the terms, to the degree possible, to types of organizations, e.g., government, health sector, financial sector, etc.
  79. 79. Infrastructure: Proposed Studies • Sensors in the Cloud: We intend to look at digital data provenance (Buneman, 2000) issues specific to mobile sensors to develop and carry out a risk assessment related to issues of interest to the InterPARES Trust project as they arise in a specific application of mobile sensing that is being developed at MAGIC. The questions we would like to address include the following: • What are potential data provenance issues when dealing with mobile sensors? • What are the ways we can ameliorate potential risks associated with mobile sensors to make them more trustworthy?
  80. 80. Protection Domain • Methods: Encryption, sharding, obfuscation, geographic location, etc. • Breaches • Cybercrime • Server sharing • Information Assurance • Governance • Audit
  81. 81. Protection: Proposed Studies • Standard of practice for trust in protection of authoritative records in government archives: Risk management decisions need to be made. For the protection arena, these are decisions of kind rather than amount, and have to last over long periods. As such they are architectural in nature. The objective of this research effort is to build a global consensus around a limited set of these decisions for government-level systems of records and archives. In essence, this will create a standard of practice for risk management in authoritative archives.
  82. 82. Control Domain • Integrity Metadata • Chain of custody • Retention and disposition • Transfer and acquisition • Intellectual control • Use control • Preservation
  83. 83. Control: Proposed Studies • Model the Chain of Preservation for Records Entrusted to the Internet: The modeling project will address the following questions: • Are requirements for the preservation of electronic records applicable to those entrusted to the Internet; do any of them need to be adapted? Are there other, special requirements for preservation of records entrusted to the Internet? • How can these requirements be satisfied when records are stored in cloud services? • Are there special requirements for records that are discovered and delivered via the Internet, even if they are not stored in a cloud? How can such requirements be implemented?
  84. 84. Control: Proposed Studies • The calculus of trust in records: This study will identify a range of methods and limit cases for evaluating authenticity parameters based on the authenticity parameters of inputs; examine the provenance issue; calculate how much metadata may be required to provide all of the relevant facts on which the authenticity of a record and its associated claim can be calculated; and augment these results to deal with changes in parameters based on later evaluations of information. It will also look at theories of measurement and calculation methods for a small number of case studies, as well as presentation methods, symbologies, and the psychology of presentation and interpretation of trust labels.
  85. 85. Control: Proposed Studies • Retention & Disposition in a Cloud Environment: One approach under consideration is to develop a set of RM best practices (i.e., what we believe the answers will be) and then ask questions of the providers and measure their responses/knowns against the best practice. However, some of the questions for which we wish to find answers may not lend themselves to that format and would need to be included as open-ended questions. For now we have a list of questions for which we would seek answers: • What would we need to know if we moved to the cloud in 3 years’ time? • What makes it different from other types of remote storage/database environments? • What do organizations need to tell cloud service providers to do? What are the minimum standards for retention and disposition?
  86. 86. Access Domain • Open data/big data/open government/FIPPA/etc. • Searchability/Usability • Traceability • Transparency • Accountability • The right to remember • Privacy • The right to be forgotten
  87. 87. Legal Domain • Legal Privilege • Privacy/Secrecy • Intellectual rights • Chain of evidence • Admissibility/Weight • Authentication • Certification • Contractual rules (e.g. safe harbour)
  88. 88. Terminology Cross-domain • Multilingual glossary • Multilingual dictionary with sources • Ontologies as needed • Essays explaining the use of terms and concepts within the project
  89. 89. Terminology: Proposed Studies • Big Data, Open Government, and Open Data - their evolution: Big Data, Open Data and Open Government are having a substantial impact on the online environment. The evolution and characteristics of these relatively recent themes are poorly understood, especially from a recordkeeping perspective. This lack of understanding will inhibit the effective undertaking of research projects that address the creation and management of digital records generated in these environments. Each of the themes reflects recordkeeping issues that need to be understood if ITrust research projects are to be relevant and effective. The inter-relationships among the three themes suggest that they may be experiencing the same or similar recordkeeping issues. Understanding the processes for establishing and managing Big Data, Open Data and Open Government initiatives will support ITrust research and help in the development of recordkeeping policies, standards, and practices for managing digital records in the online environment.
  90. 90. Terminology: Proposed Studies • Core Terminology for InterPARES Trust: How will InterPARES Trust define fundamental concepts, how are they understood in various contexts, and how do they relate to each other? Terms identifying such concepts, not defined in previous InterPARES terminology databases, have already surfaced at the initial meeting in Vancouver and in subsequent email, and include the following: • big data • cloud (as distinct from the Internet), both public and private • data sets • Internet (as distinct from the cloud) • open access • open data • open government • platform as service • trust
  91. 91. Resources Cross-domain • Annotated bibliographies: – published articles, books, etc. – case law – policies – statutes – standards – blogs and similar grey literature
  92. 92. Policy Cross-domain • In-depth analysis of existing policies relevant to all 5 domains, as well as regulations, procedures, standard agreements, etc.
  93. 93. Policy: Proposed Studies • Establishing retention and disposition specifications and schedules in a digital environment: Issues being addressed: impact of the digital environment on establishing retention and disposition specifications and schedules for digital records; and methods for developing and applying specifications and schedules. The objective is to develop recommendations on the establishment and implementation of retention and disposition specifications and schedules for digital records.
  94. 94. Social Issues Cross-domain Analysis of social change consequent to the use of the Internet, including but not limited to – use/misuse of social media of all types – trustworthiness of news – consequences of data leaks (intentional or accidental/force majeure) – development issues (power balance in a global perspective) – organizational culture issues – individual behaviour issues
  95. 95. Social Issues: Proposed Studies • Historical Study of Cloud-based Services: Identify, to the degree possible, those CBS that suffered significant loss of trust by the user community. From this subset of CBS, the research would assess the basis for that loss of trust and, where applicable, how the service provider recovered/restored trust or why the user community renewed its trust in the service(s). • Social Media: The first phase of the project will explore the types of social media initiatives undertaken by 5-10 government organizations (number TBD) in the US and an equal number in Canada to determine how they utilize social media to engage citizens and provide customer service, as well as how the public reacts to those initiatives. The ultimate goal of this research project is to develop two or more case studies that highlight the citizen experience with government social media tools, customer experience, and issues of trust.
  96. 96. Social Issues: Proposed Studies • Putting the 'Fun' back in 'Functional': This project will explore some of the socio-technical factors that appear to affect the management of written and non-written information in organizations. It is based on the assumption that the social (i.e., cultural, historical, political, ideological, economic, ethical, linguistic, rhetorical, epistemological,… in one word, human) interactions that are involved in using available technologies shape and are shaped by the technologies used. In particular, we are interested in understanding how people engage with the information they create/use to accomplish their work in networked environments.
  97. 97. Education Cross-domain Development of different models of curricula for transmitting the new knowledge produced by the project
  98. 98. Outcomes This project intends to generate • new knowledge on digital records maintained online and accessed from all sorts of fixed and mobile devices • shared methods for identifying and protecting the balance between privacy and access, secrecy and transparency, the right to know and the right to be forgotten • legislative recommendations related to e-evidence, cybercrime, identity, security, e-commerce, intellectual property, e-discovery and privacy • a model international statute specific to the Internet and recommendations for each government’s continued development of its current fleet of uniform statutes.
  99. 99. A Balance of Trust In the last year of the project, the activity with the greatest impact will be the development of trust relationship models, which will be iterative, as we will be working towards resolution of issues as they present themselves, with the aim of developing solutions framed as a balance of trust. To establish a “balance of trust” requires enabling the development of trustworthy technologies, procedures, and contractual conditions. We will do so by • identifying the changes needed in our paradigms of trust in data, records and records systems, and • developing an internationally shared trust framework that both providers and users can live by, because the current framework within which law enforcement operates and security concerns are addressed is inconsistent within and across jurisdictional boundaries. Only then can we require and expect transparency, compliance and accountability, in addition to security and economy, and develop Trust in the Internet.
  100. 100.