1. The Archive, Big Data, and Security
Or
Why do dusty documents really matter?
Tim Gollins
Head of Digital Preservation
(The National Archives)
&
Honorary Research Fellow
(Glasgow University School of Computing Science)
2. Outline
● The Archive is Big Data
● The Archive is about our Security
● Sensitivity Review of Digital Records
– Both a Threat and an Opportunity to The Archive and thus to our Security
3. The Archive is Big Data
● Big data is not new
– Data of a volume that is transformative
● Medieval Times
– The Master of The Rolls
● Over 1 Billion sheets of paper (records)
● Already over 1.5 Billion web pages
● Capacity for over 13 Petabytes of Digital Records
4. The Archive is Security
● Security rests on the Citizen's Trust in the State
● The Archive underpins the fabric of our society
● Enables Trust
– Is the impartial witness
– Holds the executive to account - the court of history
● Fundamental to The Rule of Law
– Underpins many of Lord Bingham’s principles
– E.g. “Ministers and public officers at all levels must exercise the powers conferred on them in good faith, fairly, for the purpose for which the powers
5. Transfer To The Archive
● Complex and Opaque process
– Decisions can appear perverse
– Checks and Balances
– Involving an “Advisory Council”
– The “Lord Chancellor's Blanket” (Blue?)
● Journalists and Eminent Historians are questioning the process
● Conspiracy Theorists Ply Their Trade
6. Digital Sensitivity Review
● Threat
– Volume & Resources
– Complexity - Content and Containers
– Risk – specifics are now easy to find
– Decisions – The Rule of Law
– Timing – transition from 20 to 30 years
● Opportunity
– Some things are easier – but search can overload
– Constancy
– Efficiencies possible
– Technically Assisted Digital Sensitivity Review
7. Conclusion - The Right Balance
● Freedom of Information not just openness
● Openness & Transparency of process
● Calls for Privacy - keep “my” data private
● Calls for Openness - what is done in “my” name
● Need for limits
– National Security
– Protection of individuals from harm
● Digital Records Make It Much Harder
● Clear need for Research
– and the means to conduct it
Editor's Notes
The Cuneiform tablets in Babylon – including instructions to build an Ark
The Library of Alexandria
Large collections of records have always been transformative, and thus I would regard them as the “big data” of their time.
From the 12th century the duties of a clerk responsible for the “rolls” are worthy of mention
Explicitly known as the “Master of the Rolls” by the 15th Century
The holder of that post now chairs the Lord Chancellor’s Advisory Council that assures the transfer of records to the archives
The Paper Holdings at Kew are over 1 Billion pages (1000 years of documents)
The UK Government Web Archive (less than 20 years of material, most of it from the last 5) held over 1.5 Billion pages as of 18 months ago
Security relies on the trust of the citizen in the state
It is about The Rule of Law and the fact that the executive cannot be above that rule
For the UK it is about the very fabric of our society
The British state is different from many others in that the citizen expects the state to be subservient to her, rather than the reverse, which is the more common case
The Rule of Law supports and empowers the citizen
National Archives are fundamental to all of this
They provide the impartial witness that enables the holding to account under the rule of law and in the court of history
Bingham’s 4th Principle – accountability for the executive – how can we know what they have done if the records are not kept?
It follows that the citizen must therefore trust the process by which the archive receives its material to sustain his rights.
The UK system requires that selection, appraisal and sensitivity review are carried out by the department. This is counterintuitive, as it appears to allow the department to hide material that it does not wish to see the light of day!
However, in creating this system the great archivist Jenkinson, who articulated many of the fundamentals of the UK system, was trying to ensure that, unlike the Nazi archive that was complicit in the Holocaust, the UK archive was able to guard its independence under the rule of law.
There are checks and balances.
The first is the right of access under FOI.
The second is the public visibility of the selection criteria that the departments must apply.
The third is the Archive's oversight of the application of those criteria.
The fourth is the Lord Chancellor's Advisory Council's oversight of the application of FOI exemptions during sensitivity review, and their role in ensuring the timely transmission.
But what colour is the “Lord Chancellor’s Blanket”?
Professor Margaret MacMillan, Warden of St Antony's College, Oxford, quoted in the Guardian: "I am one of many historians who has benefited from using the British archives and who had confidence that the documents had not been weeded to suit particular interests. Now I am wondering whether I will have to go back and rethink my work on such matters as the outbreak of the first world war or the peace conference at the end. But when are we going to get the complete records? So far the pace of transferring them is stately, to put it politely."
It looks like we have something to hide, and such appearances are important.
Volume and Resources: Following the advance of office technology during the twentieth century and the broadening of the interests of the scholarly community, a much greater volume of material is being deemed worthy of preservation in the digital age. Against a background of budgetary constraint, manual review of born-digital records is not practical.
Complex Context: Across government and elsewhere the impact of technology has eroded earlier clear and unambiguous rules for the creation and management of information. This was very obvious in the evidence presented to the Hutton Inquiry [4], where the paper trail for a decision was no longer in a single manila file; instead, the record was found in a blizzard of emails sent from person to person and stored on multiple computing systems [5]. This situation will significantly complicate digital sensitivity review, as understanding a record’s context (including its distribution) is crucial in assessing its sensitivity.
Risk: These challenges for review also occur in a context of significantly increased risk. Although the consequences of mistaken disclosure have not changed with the advent of digital records, the probability of discovering a mistake has. It is hard to discover particular information in the paper world, in marked contrast to the digital environment where ubiquitous search engines index content rapidly. A risk-averse depositor may feel obliged to close large swathes of records if they cannot efficiently and effectively determine the sensitivity of each individual record with some clear degree of certainty.
Defensible Decisions: The risk environment is further complicated by the fact that all closures of public records are open to challenge through FOIA, appeal to the Information Commissioner and ultimately in the courts. This means that the Digital Sensitivity Review process must produce decisions which stand up to external scrutiny and with which the Lord Chancellor’s Advisory Council and audit and risk management committees both inside and outside the public sector are comfortable.
There is a fundamental difference between openness (driven by what the state wants you to see) and Freedom of Information, which prescribes your right to see while creating a balance between the public interest, the state interest and the personal interest, based on Human Rights and the Rule of Law.
The concept of the FOI framework is fundamentally sound (in my view). Although of course the details must always be open to debate, they are open to public scrutiny and The Rule of Law.
Openness of process – Drawing on Bingham's second principle, “Questions of legal right and liability should ordinarily be resolved by application of the law and not the exercise of discretion”, and the earlier principle, “All persons and authorities within the state, whether public or private, should be bound by and entitled to the benefit of laws publicly made, taking effect (generally) in the future and publicly administered in the courts”. I am of course extrapolating and generalising, but I think reasonably.
It is about trust engendered by right being seen to be done!