Designing Preservable Websites
Nicholas Taylor
@nullhandle

DC, VA & MD Search Engine Marketing Meetup
July 18, 2012

“found glass” by Flickr user nuanc under CC BY-NC-ND 2.0

why preserve the web?
copy of the first webpage

web archivists aren’t visible stakeholders
(diagram labels: design, archiving, usage)

search engine crawler ≠ archival crawler
“GoogleBots” by Flickr user ares64 under CC BY 2.0

what is a “preservable” website?
“Fish Preserver” by Flickr user ecstaticist under CC BY-NC-SA 2.0

three priorities:
• capture: can resources be acquired by current web archiving technologies?
• replay: can the user’s experience of the original website be recreated from the archived resources?
• preservation: how can it be assured that the archived website remains coherent over time?

follow web standards and accessibility guidelines
“Web Standards Fortune Cookie” by Flickr user mherzber under CC BY-SA 2.0

be careful with robots.txt exclusions
robots.txt for Last.fm

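
As a rough illustration of why exclusions deserve care, the sketch below uses Python's standard urllib.robotparser to list which URLs a crawler would be barred from fetching under a given robots.txt. The robots.txt content, the example.com URLs, and the user-agent string "example-archival-crawler" are all hypothetical; real archival crawlers may apply their own policies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /css/
Disallow: /js/
Disallow: /admin/
"""

# Hypothetical URLs a replayed page would depend on.
URLS = [
    "http://example.com/index.html",
    "http://example.com/css/site.css",
    "http://example.com/js/nav.js",
]


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots_txt forbids user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# "example-archival-crawler" is a made-up user agent for illustration.
blocked = blocked_urls(ROBOTS_TXT, "example-archival-crawler", URLS)
for url in blocked:
    print("excluded from capture:", url)
```

Here the page itself can be captured, but its stylesheet and script cannot, so a replayed copy would likely look or behave differently from the original.
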
use a site map, transparent links, and contiguous navigation
“Card sorting” by Flickr user Manchester Library under CC BY-SA 2.0

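
To make "transparent links" concrete, here is a deliberately simplified model of link discovery, written with Python's standard html.parser. It only follows plain href values on anchor tags. The page markup and the function name crawlable_links are hypothetical, and some real crawlers do try to pull URLs out of script text, so treat this as a conservative sketch rather than a description of any particular crawler.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with three navigation styles.
PAGE = """
<a href="/about/">About</a>
<a href="javascript:void(0)" onclick="go('/products/')">Products</a>
<span onclick="location.href='/contact/'">Contact</span>
"""


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags, ignoring javascript: pseudo-links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html_text):
    """Return the links this simplified crawler model would discover."""
    extractor = LinkExtractor()
    extractor.feed(html_text)
    return extractor.links


print(crawlable_links(PAGE))
```

Only the first link is found. Under this model the "Products" and "Contact" pages would be captured only if something else, such as a site map, points to them.
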
maintain stable URLs and redirect when necessary
“Improvised detour sign” by Flickr user Jason McHuff under CC BY-SA 2.0

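
One way to keep redirects manageable as a site is reorganized is to maintain a single map from retired paths to current ones and collapse any chains, so each old URL points straight at its final location. The sketch below assumes such a map exists; the paths and the helper names final_target and flatten are hypothetical, and how the map is turned into actual HTTP redirects depends on the server in use.

```python
# Hypothetical redirect map: retired path -> replacement path.
REDIRECTS = {
    "/old-about.html": "/about/",
    "/about/": "/company/about/",
}


def final_target(path, redirects, max_hops=10):
    """Follow the redirect chain from path and return where it ends."""
    seen = {path}
    hops = 0
    while path in redirects:
        path = redirects[path]
        hops += 1
        if path in seen or hops > max_hops:
            raise ValueError("redirect loop or overly long chain")
        seen.add(path)
    return path


def flatten(redirects):
    """Return a map in which every retired path points at its final target."""
    return {old: final_target(old, redirects) for old in redirects}


for old, new in flatten(REDIRECTS).items():
    print(old, "->", new)
```

With the sample map, both retired paths end up pointing directly at /company/about/.
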
consider using a Creative Commons license
“2500 Creative Commons Licenses” by Flickr user qthomasbower under CC BY-SA 2.0

use durable data formats
“Lascaux cave painting” by Flickr user qoforchris under CC BY-ND 2.0

embed metadata, especially the character encoding
source code of http://www.seo.com/

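
A quick way to check whether a page declares its character encoding in the markup is to look for a meta tag that carries it. The sketch below, using Python's standard html.parser, handles the two common forms (meta charset and meta http-equiv="Content-Type"). The sample markup and the function name declared_charset are hypothetical, and the sketch ignores encodings declared only in HTTP headers.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Records the first character encoding declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # Form 1: <meta charset="utf-8">
            self.charset = attrs["charset"].strip().lower()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Form 2: <meta http-equiv="Content-Type"
            #               content="text/html; charset=utf-8">
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset" and value:
                    self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the declared encoding in lowercase, or None if absent."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


# Hypothetical sample page.
print(declared_charset('<html><head><meta charset="UTF-8"></head></html>'))
```

A result of None suggests that replay software would have to guess the encoding, which is the situation embedded metadata is meant to avoid.
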
use archiving-friendly platform providers and CMSs
robots.txt for Drupal 7

three tips
1. see how well your site validates on http://validator.w3.org/
2. see how your site looks on http://archive.org/
3. your favorite online sitemap generator is a good starting point
“Highlighters” by Flickr user KJGarbutt under CC BY-ND 2.0

thank you!

Nicholas Taylor
@nullhandle


Editor's Notes

  1. Design decisions have a major effect on website preservability.
  2. “Benign neglect” may have been sufficient for physical objects; more active interventions are needed for digital ones.
  3. Design and usage inform each other; where does web archiving fit?
  4. Because web archivists care about recreating the user experience, they care about all assets being crawled.
  5. Good also for usability and SEO. Web crawlers access sites like a text browser. Replay platform must accommodate coding idiosyncrasies.
  6. CSS and JavaScript directories matter for archiving but perhaps not for search engine indexing.
  7. Crawler can only capture links it sees. User of archived site can only navigate by following links. Avoid relying on Flash, JavaScript, or other technologies that obscure links. Use a site map.
  8. Link rot is common. Web archiving tools are URL-sensitive. Stable/redirect URLs make for seamless archive access.
  9. Copyright law lacks explicit provisions for digital preservation. Many libraries ask for permission to archive websites. Creative Commons provides affirmative permission to be crawled and preserved.
  10. Websites contain many different file types, each with distinct preservation risks. Favor open standards and file formats, except when they are poorly documented or where vendor-specific extensions are allowed.
  11. Embedded metadata makes it easier to replay and preserve archived sites.
  12. Platform providers are more likely to accommodate commercial search indexers than archival crawlers. If you care about archiving, inquire about policies, examine robots.txt, or check how the website looks in Internet Archive’s Wayback Machine. If you’re using an open source CMS, be sure to review the bundled robots.txt.
  13. While following these recommendations won’t guarantee perfect archiving, not following them will ensure additional challenges.