Supporting the preservation lifecycle in repositories

Effective digital preservation requires repositories to incorporate processes such as planning, monitoring and preservation operations. These processes feed into each other and form a continuous cycle that lets a repository detect opportunities and risks and act accordingly.
Each of these processes has already been studied extensively, and tools exist to support each one, but many repository implementations still lack complete and continuous digital preservation features. In this presentation, Luís Faria of KEEP SOLUTIONS gives a global view of digital preservation processes and how they fit together in a digital preservation cycle. He also describes tools that support these processes and explains how to integrate them incrementally into digital repositories to provide a complete, systematic and semi-automatic digital preservation system.
The presentation was given on July 9, 2013, at Open Repositories 2013, Charlottetown, Canada.

Supporting the preservation lifecycle in repositories

  1. Supporting the preservation lifecycle in repositories
     Luis Faria, lfaria@keep.pt
     KEEP SOLUTIONS, www.keep-solutions.com
     Open Repositories 2013, Charlottetown, PEI, Canada, 2013-07-09
     http://goo.gl/V6142
  2. KEEP SOLUTIONS (http://www.keep-solutions.com)
     • Company specialized in information management
     • Digital preservation experts
     • Open source: RODA, KOHA, DSpace, Moodle, etc.
     • Scientific research
       • SCAPE: large-scale digital preservation environments
       • 4C: digital preservation cost modeling
     This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).
  3. KEEP SOLUTIONS research partners
  4. The past: RODA 1.0.0
     • Presented at Open Repositories 2009
     • Open source digital repository
     • Based on Fedora Commons
     • Modern web interface
     • For archives
     • For digital preservation
  5. The present: RODA Community
     • Adapted to be a true open-source project
     • For users
       • Easy to install
       • Easy to test (virtual machine)
       • Support mailing lists and documentation
       • Free or paid support
     • For developers
       • Development and translation guidelines
       • Easy build (Maven)
       • Available on GitHub
       • Support mailing lists
       • Plenty more documentation
     • More info: http://www.roda-community.org
  6. Current practice problems
     • Repository has content
     • Organization has policies in place (e.g. no compression allowed)
       P1: Does the content conform to policies? Are there any risks? Even with changing content, policies and environment?
     • Found a preservation risk!
       P2: How to easily and trustworthily decide which action to take?
  7. Current practice problems
     • Know what action to take
       P3: How to ensure and monitor the quality of the chosen action and that the decision assumptions remain valid?
     • Content grows exponentially in volume, heterogeneity and complexity
       P4: How to do digital preservation in large-scale environments?
  8. Preservation lifecycle (diagram). Elements: Repository; Environment and users. Flows: access, ingest, harvest.
  9. Preservation lifecycle (diagram). Elements: Watch; Repository; Environment and users. Flows: monitored content and events; monitored environment and users; access, ingest, harvest.
  10. Preservation lifecycle (diagram). Elements: Planning; Watch; Repository; Environment and users. Flows: create/re-evaluate plans; monitored content and events; monitored environment and users; access, ingest, harvest.
  11. Preservation lifecycle (diagram). Elements: Planning; Operations; Watch; Repository; Environment and users. Flows: create/re-evaluate plans; deploy plan; monitored content and events; monitored environment and users; access, ingest, harvest.
  12. Preservation lifecycle (diagram). Elements: Planning; Operations; Watch; Repository; Environment and users. Flows: create/re-evaluate plans; deploy plan; execute action plan; monitored actions; monitored content and events; monitored environment and users; access, ingest, harvest.
  13. Preservation lifecycle (diagram). Elements: Planning; Operations; Watch; Policies; Repository; Environment and users. Flows: create/re-evaluate plans; deploy plan; execute action plan; monitored actions; monitored content and events; monitored environment and users; access, ingest, harvest.
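The cycle on slide 13 can be read as a loop in which Watch detects risks, Planning decides on an action, and Operations executes it and leaves a trace that Watch can monitor. The following is a minimal sketch of that reading, not code from RODA or SCAPE: every name, the data shapes and the format labels such as `tiff-lzw` are hypothetical, and the policy is modelled after the "no compression allowed" example on slide 6.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository mapping object id -> format label (hypothetical shape)."""
    content: dict[str, str]
    action_log: list[str] = field(default_factory=list)


def watch(repo: Repository, policies: dict[str, set[str]]) -> list[str]:
    """Watch: return ids of objects whose format the policies disallow."""
    banned = policies["disallowed_formats"]
    return sorted(oid for oid, fmt in repo.content.items() if fmt in banned)


def plan(at_risk: list[str], target_format: str) -> dict[str, str]:
    """Planning: pick an action (here, a single migration target) per at-risk object."""
    return {oid: target_format for oid in at_risk}


def operate(repo: Repository, preservation_plan: dict[str, str]) -> None:
    """Operations: execute the plan and log each action so it can be monitored."""
    for oid, fmt in preservation_plan.items():
        repo.action_log.append(f"migrated {oid}: {repo.content[oid]} -> {fmt}")
        repo.content[oid] = fmt


def run_cycle(repo: Repository, policies: dict[str, set[str]], target_format: str) -> int:
    """One pass of watch -> plan -> operate; returns how many objects were acted on."""
    at_risk = watch(repo, policies)
    operate(repo, plan(at_risk, target_format))
    return len(at_risk)


repo = Repository(content={"a": "tiff-lzw", "b": "tiff", "c": "tiff-lzw"})
policies = {"disallowed_formats": {"tiff-lzw"}}
first = run_cycle(repo, policies, "tiff")   # acts on the two compressed objects
second = run_cycle(repo, policies, "tiff")  # nothing left to do on the second pass
```

Running the cycle twice illustrates why the loop is continuous: the second pass finds nothing, but a change in content or policies would surface new risks on the next pass.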
  14. Preservation lifecycle (in practice) (diagram). Elements: Planning; Operations; Watch; Policies; Repository; Environment and users. Flows: create/re-evaluate plans; deploy plan; execute action plan; monitored actions; monitored content and events; monitored environment and users; access, ingest, harvest.
  15. Preservation lifecycle (in practice) (diagram). Elements: Planning; Operations; Watch; Policies; Repository; Environment and users. Tools: Scout; Plato; Workflow engine. Flows: create/re-evaluate plans; deploy plan; execute action plan; monitored actions; monitored content and events; monitored environment and users; access, ingest, harvest.
  16. SCAPE Preservation Suite (diagram). Labels: Repository; Environment and users; Scout; Scout Web UI & Email notification; Notification API; Scout Adaptors; Report API; Plan management API; Data Connector API; Plato; Plato Web UI; Planner; Workflow engine; Workflow engine API. Flows: deploy plan; create/re-evaluate plans; execute plan; monitored events and actions; monitored content; access, ingest, harvest.
  17. SCAPE Preservation Suite (same diagram as slide 16), with workflow engine options:
      • Small scale: Taverna (http://www.taverna.org.uk)
      • Large scale: SCAPE platform (http://wiki.opf-labs.org/display/SP/SCAPE+Platform)
  18. SCAPE Preservation Suite (same diagram as slide 16), annotated with P1, P2, P3, P4 and "Automation and integration".
  19. SCAPE Preservation Suite: Tools and APIs
  20. SCAPE Digital object model
      • Standard model for representing digital objects
      • Based on METS and PREMIS
      • Specifies intellectual entity (SIP, AIP and DIP)
      • Specification: https://github.com/openplanets/scape-platform-api
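To make the object model on slide 20 more concrete, here is a rough sketch of how an intellectual entity might be represented in code. The nesting (entity, representation, file, bit stream) is an assumption drawn from the terms used on the Data Connector API slide; the class and field names are hypothetical, and the actual model is defined in METS and PREMIS by the specification linked above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BitStream:
    """A named bit stream inside a file (hypothetical shape)."""
    name: str


@dataclass
class File:
    file_id: str
    mimetype: str
    bitstreams: list[BitStream] = field(default_factory=list)


@dataclass
class Representation:
    """One rendition of the entity, e.g. an original or a migrated copy."""
    representation_id: str
    files: list[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    entity_id: str
    descriptive_metadata: dict[str, str] = field(default_factory=dict)
    representations: list[Representation] = field(default_factory=list)

    def all_files(self) -> list[File]:
        """Flatten the files of every representation, in order."""
        return [f for rep in self.representations for f in rep.files]


entity = IntellectualEntity(
    entity_id="entity-1",
    descriptive_metadata={"title": "Annual report 2012"},
    representations=[
        Representation("original", [File("f1", "application/pdf")]),
        Representation("access", [File("f2", "image/jpeg", [BitStream("page-1")])]),
    ],
)
file_ids = [f.file_id for f in entity.all_files()]
```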
  21. Data Connector API
      • Access and modify content on the repository
      • HTTP REST API
      • Methods:
        • Retrieve intellectual entity, metadata, representation, file or named bit stream
        • Ingest intellectual entity (sync or async)
        • Update intellectual entity, representation or file
        • Search intellectual entities, representations or files (SRU)
      • API specification: https://github.com/openplanets/scape-platform-api
      • Ref. implementation: Fall 2013 in Fedora 4 and RODA
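The four method groups on slide 21 (retrieve, ingest, update, search) can be pictured as a small interface. The sketch below is an in-memory stand-in, not the HTTP REST API itself: it makes no network calls, the method names and dictionary shapes are hypothetical, and a plain substring match stands in for SRU search. The real endpoints are defined in the linked specification.

```python
from __future__ import annotations

from typing import Protocol


class DataConnector(Protocol):
    """Operations a repository would expose, named after slide 21 (hypothetical signatures)."""

    def retrieve(self, entity_id: str) -> dict: ...
    def ingest(self, entity: dict) -> str: ...
    def update(self, entity_id: str, entity: dict) -> None: ...
    def search(self, term: str) -> list[str]: ...


class InMemoryConnector:
    """Toy implementation that keeps entities in a dictionary."""

    def __init__(self) -> None:
        self._entities: dict[str, dict] = {}
        self._next_id = 1

    def ingest(self, entity: dict) -> str:
        """Store a copy of the entity and return its newly assigned id."""
        entity_id = f"entity-{self._next_id}"
        self._next_id += 1
        self._entities[entity_id] = dict(entity)
        return entity_id

    def retrieve(self, entity_id: str) -> dict:
        """Return a copy so callers cannot mutate stored state by accident."""
        return dict(self._entities[entity_id])

    def update(self, entity_id: str, entity: dict) -> None:
        """Replace an existing entity; unknown ids raise KeyError."""
        if entity_id not in self._entities:
            raise KeyError(entity_id)
        self._entities[entity_id] = dict(entity)

    def search(self, term: str) -> list[str]:
        """Case-insensitive title match, standing in for an SRU query."""
        needle = term.lower()
        return sorted(
            eid for eid, e in self._entities.items()
            if needle in e.get("title", "").lower()
        )


connector: DataConnector = InMemoryConnector()
id1 = connector.ingest({"title": "Annual report 2012"})
id2 = connector.ingest({"title": "Photo collection"})
connector.update(id2, {"title": "Photo collection (revised)"})
found = connector.search("report")
```

Keeping the interface separate from the implementation mirrors the slide's point that reference implementations were planned for more than one repository (Fedora 4 and RODA) behind the same API.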
  22. Report API
      • Provides access to repository events
      • Events:
        • Ingest started and finished
        • Viewed or downloaded descriptive metadata or representation
        • Preservation plan executed
      • OAI-PMH data provider
      • PREMIS events metadata
        • Agent: who triggered the event
        • Date/time: when did the event occur
        • Details: what happened
      • API specification: https://github.com/openplanets/scape-platform-api
      • Ref. implementation: https://github.com/openplanets/roda
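Because the Report API on slide 22 is an OAI-PMH data provider, a client would harvest events with standard OAI-PMH requests. The sketch below only builds a `ListRecords` request URL and models an event with the three PREMIS fields named on the slide. The `verb`, `metadataPrefix` and `from` parameters are standard OAI-PMH; the base URL, the `premis-event` prefix value, the event type strings and the class name are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class RepositoryEvent:
    """An event with the three fields listed on the slide (hypothetical class)."""
    event_type: str
    agent: str      # who triggered the event
    date_time: str  # when the event occurred, ISO 8601
    details: str    # what happened


def list_records_url(base_url: str, metadata_prefix: str, from_date: str | None = None) -> str:
    """Build an OAI-PMH ListRecords request, optionally limited to records since from_date."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def events_by_agent(events: list[RepositoryEvent], agent: str) -> list[RepositoryEvent]:
    """Keep only the events triggered by the given agent."""
    return [e for e in events if e.agent == agent]


# Hypothetical endpoint and metadata prefix; check the repository's Identify and
# ListMetadataFormats responses for the real values.
url = list_records_url("https://repository.example.org/oai", "premis-event", "2013-07-01")

events = [
    RepositoryEvent("ingest finished", "archivist", "2013-07-02T10:00:00Z", "SIP accepted"),
    RepositoryEvent("representation downloaded", "guest", "2013-07-03T09:30:00Z", "PDF copy"),
]
by_archivist = events_by_agent(events, "archivist")
```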
  23. Scout: a preservation watch system
      • Monitors aspects of the world to detect preservation risks and opportunities
      • Triple store
      • Adaptors
        • Data Connector & Report API
        • SCAPE Policy model
        • PRONOM
        • Web semantic extraction
        • Renderability experiments
      • Web interface
        • Triggers: templates and SPARQL
        • Email notifications
      • Demo: http://scout.scape.keep.pt
      • http://openplanets.github.io/scout/
      Diagram labels: Content; Policies; Web; Human knowledge; Registries; Scout; Risk notification.
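Slide 23 describes Scout as a triple store fed by adaptors, with triggers that raise notifications. As a rough illustration of that idea only, the sketch below stores facts as subject, predicate, object triples and runs one trigger against them. It uses plain Python pattern matching in place of SPARQL, and the predicate names, identifiers and message text are all hypothetical; they are not Scout's actual vocabulary or API.

```python
from __future__ import annotations

Triple = tuple[str, str, str]
Pattern = tuple


def match(triples: list[Triple], pattern: Pattern) -> list[Triple]:
    """Return the triples matching a 3-item pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def compression_trigger(triples: list[Triple], compression_allowed: bool) -> list[str]:
    """Trigger: when policy forbids compression, report every compressed file."""
    if compression_allowed:
        return []
    hits = match(triples, (None, "compression", None))
    return [f"Risk: {s} uses {o} compression" for s, _, o in hits if o != "none"]


# Facts as an adaptor might collect them from a content profile (made-up data).
facts: list[Triple] = [
    ("file:1", "format", "tiff"),
    ("file:1", "compression", "lzw"),
    ("file:2", "format", "tiff"),
    ("file:2", "compression", "none"),
]
alerts = compression_trigger(facts, compression_allowed=False)
```

In a real deployment the trigger condition would be a template or SPARQL query and the alert would go out by email, as listed on the slide.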
  24-32. Scout demo: http://scout.scape.keep.pt
  33. Plan management API
      • Deploy and manage preservation plans in the repository
      • HTTP REST API
      • Methods:
        • Search and retrieve plans
        • Deploy a new plan
        • Retrieve or add a plan execution state (in progress, success or fail)
        • Update plan lifecycle status (enabled or disabled)
      • Implementation can use:
        • Workflow engine: Taverna or SCAPE platform
        • Data Connector API
      • API specification: https://github.com/openplanets/scape-platform-api
      • Ref. implementation: Fall 2013 for Fedora 4 and RODA
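The methods on slide 33 imply two separate pieces of state per plan: a history of execution states and a lifecycle status. A minimal in-memory sketch of that bookkeeping follows. The enum values come from the slide; the class, the method names and the choice to enable a plan on deployment are assumptions, and nothing here reflects the actual REST endpoints.

```python
from __future__ import annotations

from enum import Enum


class ExecutionState(Enum):
    IN_PROGRESS = "in progress"
    SUCCESS = "success"
    FAIL = "fail"


class LifecycleStatus(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"


class PlanRegistry:
    """Toy store for deployed plans, their execution history and lifecycle status."""

    def __init__(self) -> None:
        self._plans: dict[str, str] = {}
        self._status: dict[str, LifecycleStatus] = {}
        self._executions: dict[str, list[ExecutionState]] = {}

    def deploy(self, plan_id: str, definition: str) -> None:
        """Register a new plan; enabling it by default is an assumption of this sketch."""
        if plan_id in self._plans:
            raise ValueError(f"plan already deployed: {plan_id}")
        self._plans[plan_id] = definition
        self._status[plan_id] = LifecycleStatus.ENABLED
        self._executions[plan_id] = []

    def retrieve(self, plan_id: str) -> str:
        return self._plans[plan_id]

    def search(self, term: str) -> list[str]:
        """Return ids of plans whose definition contains the term."""
        return sorted(pid for pid, text in self._plans.items() if term in text)

    def add_execution_state(self, plan_id: str, state: ExecutionState) -> None:
        self._executions[plan_id].append(state)

    def execution_states(self, plan_id: str) -> list[ExecutionState]:
        return list(self._executions[plan_id])

    def set_status(self, plan_id: str, status: LifecycleStatus) -> None:
        if plan_id not in self._plans:
            raise KeyError(plan_id)
        self._status[plan_id] = status

    def status(self, plan_id: str) -> LifecycleStatus:
        return self._status[plan_id]


registry = PlanRegistry()
registry.deploy("plan-1", "migrate compressed TIFF to uncompressed TIFF")
registry.add_execution_state("plan-1", ExecutionState.IN_PROGRESS)
registry.add_execution_state("plan-1", ExecutionState.SUCCESS)
registry.set_status("plan-1", LifecycleStatus.DISABLED)
```

Keeping the execution history as a list rather than a single value lets a watch component see both that a plan ran and how each run ended.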
  34. Plato: a preservation planning tool
      • Systematic planning
      • Traceable, documented, trustworthy
      • Integrated:
        • Data Connector API (Content)
        • Scout (Watch, Content profile, sampling)
        • SCAPE Policy model
        • Plan management API (Operations)
        • Taverna compatible workflows
      • http://ifs.tuwien.ac.at/dp/plato
      Diagram labels: Planning; Define requirements; Evaluate alternatives; Analyse results; Build preservation plan; Content profile; Policies; Risks or Opportunities; Environment information; Representative sample content; Action alternatives; Preservation plan; Operations; Watch.
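The planning steps in the slide 34 diagram (define requirements, evaluate alternatives, analyse results) amount to comparing candidate actions against weighted criteria. The sketch below uses a plain weighted average to rank alternatives. It is an illustration of the general idea, not Plato's actual evaluation method, and the criteria, weights, scores and alternative names are invented.

```python
from __future__ import annotations


def score(measurements: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-criterion scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(measurements[c] * w for c, w in weights.items()) / total_weight


def rank(alternatives: dict[str, dict[str, float]], weights: dict[str, float]) -> list[str]:
    """Order alternative names from best to worst weighted score."""
    return sorted(alternatives, key=lambda name: score(alternatives[name], weights), reverse=True)


# Step 1, define requirements: criteria and their relative importance (made up).
weights = {"quality": 5, "cost": 2, "speed": 3}

# Step 2, evaluate alternatives: scores from 1 (worst) to 5 (best) per criterion (made up).
alternatives = {
    "tool_a": {"quality": 4, "cost": 2, "speed": 3},
    "tool_b": {"quality": 3, "cost": 5, "speed": 4},
    "keep_as_is": {"quality": 1, "cost": 5, "speed": 5},
}

# Step 3, analyse results: the top-ranked alternative would go into the plan.
ranking = rank(alternatives, weights)
best_score = score(alternatives[ranking[0]], weights)
```

Recording the weights and per-criterion scores alongside the decision is what makes the resulting plan traceable, which is the property the slide emphasises.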
  35. Preservation lifecycle (in practice) (diagram). Elements: Planning; Operations; Watch; Policies; Repository; Environment and users. Tools: Scout; Plato; Workflow engine. Flows: create/re-evaluate plans; deploy plan; execute action plan; monitored actions; monitored content and events; monitored environment and users; access, ingest, harvest.
  36. Conclusions
      P1: Does the content conform to policies? Are there any risks? Even with changing content, policies and environment?
      S1: Use Scout: preservation watch system
      P2: How to easily and trustworthily decide which action to take?
      S2: Use Plato: preservation planning tool
  37. Conclusions
      P3: How to ensure and monitor the quality of the chosen action and that the decision assumptions remain valid?
      S3: QA in preservation plans (Plato), monitoring of QA (Report API & Scout), automatic Scout triggers created by Plato
      P4: How to do digital preservation in large-scale environments?
      S4: Automation and end-to-end integration of preservation processes
  38. Roadmap
      • Scout:
        • User support
        • More adaptors
        • More trigger templates
      • Plato:
        • Automatically create Scout triggers
        • Automatically deploy using the plan management API
      • Repository reference implementations: RODA and Fedora 4
  39. Conclusions
      • All APIs published
      • Ref. implementations in RODA and Fedora 4 in Fall 2013
      • All tools available on GitHub
      Add preservation to your repository now!
  40. Supporting the preservation lifecycle in repositories
      Luis Faria, lfaria@keep.pt
      KEEP SOLUTIONS, www.keep-solutions.com
      Open Repositories 2013, Charlottetown, PEI, Canada, 2013-07-09
      http://goo.gl/V6142
