How Databases Learn
Andrea K. Thomer
Michael B. Twidale
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
5 Mar 2014 – iConference - Berlin
Ceci n’est pas un database.
From ethnography to trace ethnography
• Hine 2006, Bietz & Lee 2009 – traditional ethnographies of scientific databases
• Schuurman, 2008: “database ethnographies”
• Geiger & Ribes, 2011: document-driven ethnography -> trace ethnographies
• looks at "how, where, and by whom [documents] are produced, edited, revised or filed" -- in a database this is much the same, but we ask these questions of tables and fields
• Study of edits on Wikipedia
• 1000s of agents making 1000s of changes over hours
• Our study: 1000s of changes by few people over many years; looking for traces of schema change, reappropriation of fields
We ask: how do databases, like
buildings, learn?
Case study: the Universal Chalcidoidea
Database
Parasitic wasps: “gem-like
inhabitants of the woodlands
heretofore unknown and by
most never seen nor dreamt of.”
– Girault, 1925
How we learned the database
How the database learned:
• Additional table added for separate project
• Duplicate, non-normal tables as an ad hoc way to manage workflow
“CRENCYRT”
“REFNEW”
Brand’s concept of shearing
Stuff (days – months)
Space plan (months to years)
Services (years – decades)
Skin (decades)
Structure (decades to centuries)
Site (eternal)
How do we account for shearing in
databases?
Additional questions:
• What makes a database able to adapt to changing uses and users?
• How does a database evolve when the people in charge of it change?
• How are fields repurposed over time?
• How is it that some databases adapt to changing needs and circumstances better than others?
• How are major renovations (refactorings) best handled?
• How do we work with unchangeable traces of earlier designs?
• How and when are “best practices” not actually for the best?
Conclusions (and more questions)
• For databases, thinking about preservation as a simple binary between migration and emulation is too simplistic
• They need to evolve, but how?
• Buildings face a similar challenge – can we gain insights from comparison?
• Questions for you:
• From Mike: Do you know of more examples of databases that have gone through gradual tweaks or punctuated leaps?
• From me: What else should we be reading?
Thank you!
Refs:
Bietz, M., & Lee, C. (2009). Collaboration in metagenomics: Sequence databases and the organization of
scientific work. ECSCW 2009, (September), 7–11. Retrieved from
http://www.springerlink.com/index/t7124470143464r9.pdf
Brand, S. (1995). How Buildings Learn: What Happens After They’re Built. Penguin Books.
Girault, A.A. (1925). “Some Gem Like Inhabitants of the Woodlands by Most Never Seen Nor Dreamt Of.” The
Literature of Platygastroidea. Retrieved from http://plazi.org:8080/dspace/handle/10199/15794
Geiger, R. S., & Ribes, D. (2011). Trace Ethnography: Following Coordination through Documentary Practices.
2011 44th Hawaii International Conference on System Sciences, 1–10. doi:10.1109/HICSS.2011.455
Hine, C. (2006). Databases as Scientific Instruments and Their Role in the Ordering of Scientific Work. Social
Studies of Science, 36(2), 269–298. doi:10.1177/0306312706054047
Schuurman, N. (2008). Database Ethnographies Using Social Science Methodologies to Enhance Data
Analysis and Interpretation. Geography Compass, 2(5), 1529–1548. doi:10.1111/j.1749-8198.2008.00150.
Acknowledgements: thanks to Katrina Fenlon, Nic Weber and Karen Wickett for feedback;
and thanks to CIRSS for funding
Databases at RLB


Editor's Notes

  1. Today I’m going to be presenting a case study as part of a work-in-progress looking at how research databases, particularly relational databases, “learn” and change over time. This is something that I was particularly excited to bring to iConference as a note because when I attended last year, I was so impressed by the feedback I saw notes authors getting from their audience. So I’m really hoping to hear from you guys at the end of this talk! In particular, we’re hoping you can help us think up more examples of the objects at the center of our study: long-lived research databases. As the title of this talk hopefully implies, we’re interested in how these databases – particularly relational databases that have been used more or less continuously for more than five years – change, grow and are repurposed over time.
  2. Motivation: But before we get into the examples of long-lived databases that we have found, I want to explain why we’re looking at them in the first place. Tony Hey this morning talked about the move toward “data intensive science” and the computational turn brought about by the “fourth paradigm” of data driven discovery. In doing so, he referenced the network of databases that makes up PubMed Central by using a nice, tidy little diagram showing a bunch of circles linked together by arrows. While tidy little diagrams involving circles and cylinders look nice on slides and grant proposals, they’re a terrible representation of the reality and messiness of a relational database as it’s maintained by a number of people over a number of years. It’s our opinion that while we as a discipline are very good at abstracting databases into formalisms like normal forms and entity-relationship diagrams, we need to be better at relating those formalisms to long-term use, and furthermore, need to work harder to account for use beyond an initial set of use cases.
  3. Prior work has used ethnographic methods to document and describe that complex interaction between humans and information infrastructures. However, as Geiger and Ribes point out, traditional ethnographies are simply impossible when you’re hoping to describe distributed work. In their case study of edits made to Wikipedia, they’re talking about geographically distributed but concurrent work. Here, we’re talking about temporally distributed but often spatially or site-constrained work: edits made to a particular database situated in a laboratory. In this work we’re conducting something similar to what they call a trace ethnography, which builds on the tradition of document-driven ethnography, looking at "how, where, and by whom they are produced, edited, revised or filed" -- in a database this is much the same, but we ask these questions of records, tables and fields.
  4. In addition to conducting this trace ethnography, we’re exploring the application of concepts outlined by Stewart Brand in his 1995 book, “How Buildings Learn”. In this book, Brand describes how buildings “learn” from their owners and inhabitants, and how their structure, skin and space change in response to changing needs. Architecture-as-metaphor isn’t new in software engineering, but we think it can help clarify fuzzy concepts and phenomena, particularly for database use.
  5. Our case study focuses on the UCD – a database of chalcid wasp names, references and geolocations. Chalcid wasps are tiny but plentiful – there are 22k described species but up to half a million in existence – and this database contained some 10k records. Originally built by John Noyes for the British NHM, the database was in need of migration – specifically to a larger taxonomic database called “TaxonWorks”.
  6. I was originally brought to this database as a part-time employee of the Natural History Survey, with the job title of “taxonomic data modeler”. My primary goal was to interpret the UCD’s original creators’ “schema” and turn it into something more formal so that the records could be migrated. Our ethnography at this point involved looking through the collection of files we were handed and trying to interpret the many partially or poorly described tables. The “flowchart” on the left is the primary descriptive document that we had to work with, as well as a collection of text files representing the individual tables.
  7. However, in comparing the flowchart to the actual files and tables, we began finding discrepancies. Notably, we found far more tables than were described by the flowchart.
  8. We realized that Noyes had made extensive alterations and edits to his db’s structure after creating his first set of documentation. The files we were given included 34 tables, whereas Noyes’ original schema only contained 22. After consulting with Noyes, we learned that he had “ingested” several other datasets into his own over the course of the UCD’s lifespan, and had, furthermore, begun using the UCD for local data management of some related-but-separate projects, such as a table titled “crencyrt”, which contains data from a survey of Costa Rican Encyrtidae ranges otherwise unrelated to the rest of Noyes’ data aggregation efforts. Additionally, Noyes relied on non-expert, unpaid museum volunteers for data entry, so he stringently checked all of their work before “accepting” it into the database. In order to do this, Noyes created proxy tables into which volunteers could enter their data. Noyes would then manually migrate these new records to the main set of tables.
  9. Like I said before, a lot of the prior work looking at databases in the workplace has been ethnographic, and thus a bit difficult to generalize. And our representations of databases are too static – they don’t reflect change. A combination of a trace ethnography and Brand’s framing allows us to study the interplay between engineering, a particular lab’s culture, and the day-to-day getting on with it. In the case of the UCD, the changes to this database particularly lend themselves to Brand’s architectural metaphors: tables were “added on” like spare rooms to make room for an expanding “family” of projects. Because Noyes so carefully curated his data, we did not find some of the quirks of long-term use that we have observed in our own prior work with databases, such as gradual change in the use of certain fields over time (the repurposing of a room, in Brand’s rendering), or the shearing of large tables into smaller subsections.
  10. There have been some references to shearing related to software development in general, but nothing too well fleshed out: Systems of Record, Systems of Differentiation and Systems of Innovation. We often discuss digital preservation as a straightforward dichotomy between migration and emulation.
  11. In conclusion: this opens up a lot of research questions like:
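
Two of the practices described in notes 7 and 8 can be sketched in code: comparing the tables a piece of documentation names against the tables actually present, and moving checked records from a proxy table into a main table. The following is a minimal sketch under stated assumptions, not the UCD's actual schema or workflow. It assumes the tables have been loaded into SQLite, and the names `refs`, `refs_proxy`, `checked`, and both function names are hypothetical; only `crencyrt` is a table name taken from the talk. The notes describe a manual migration, which this sketch automates for illustration.

```python
import sqlite3


def undocumented_tables(conn, documented):
    """Tables present in the database but missing from the documentation."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    actual = {name.upper() for (name,) in rows}
    return sorted(actual - {name.upper() for name in documented})


def accept_checked(conn):
    """Move curator-approved rows from the proxy table into the main table."""
    with conn:  # single transaction: the copy and the delete commit together
        count = conn.execute(
            "SELECT COUNT(*) FROM refs_proxy WHERE checked = 1"
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO refs (citation) "
            "SELECT citation FROM refs_proxy WHERE checked = 1"
        )
        conn.execute("DELETE FROM refs_proxy WHERE checked = 1")
    return count


# Hypothetical miniature database: one main table, one proxy table for
# volunteer data entry, and one table added for a separate project.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE refs (id INTEGER PRIMARY KEY, citation TEXT NOT NULL);
    CREATE TABLE refs_proxy (
        id INTEGER PRIMARY KEY,
        citation TEXT NOT NULL,
        checked INTEGER NOT NULL DEFAULT 0  -- set to 1 after curator review
    );
    CREATE TABLE crencyrt (id INTEGER PRIMARY KEY);
""")

# Assumed documentation that mentions only the main table.
documented = ["refs"]
extra = undocumented_tables(conn, documented)
print(extra)

# Volunteers enter two records; the curator approves only the first.
conn.executemany(
    "INSERT INTO refs_proxy (citation) VALUES (?)",
    [("Entry A",), ("Entry B",)],
)
conn.execute("UPDATE refs_proxy SET checked = 1 WHERE citation = 'Entry A'")
moved = accept_checked(conn)
print(moved)
```

Comparing table names as sets keeps the check independent of table order, and wrapping the copy and the delete in one transaction means a record cannot end up in both tables or in neither if the move fails partway.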