Bits of charters: putting Carolingian charters into a database

Text of presentation on data structures in the Making of Charlemagne's Europe database, given at the Medieval Studies in the Digital Age seminar, Leeds, February 2015

INTRODUCTION

[Slide 2: project website] The Making of Charlemagne’s Europe project ran from 2012-2014 at King’s College London and created a database framework for the storage and retrieval of prosopographical, geographical and socio-economic data from early medieval charters. The project team also input data from nearly 1000 charters into the database system to produce a corpus that could address a wide range of research queries. If anyone would like a demonstration of the database, I can give one briefly later.

This talk is focusing on the decision processes behind the scenes of the project, discussing some aspects of the database design and the reasons for them. How do you decide the technological form and scope of a project? How can types of data such as places be entered consistently and retrieved effectively? How do you then link together basic units of people, places and possessions to provide useful representations of activities in a charter? And finally, having put all the data into a database, how do you get it out again? What I’m trying to show is some of the practical problems involved in taking unstructured and often ambiguous medieval information and putting it in digital formats.

But I want to start with the most basic question: why this project? Digital humanities reflects the interaction of two different fields: for a good project you need a balance between the technological developments that make new digital approaches feasible and the research questions that scholars want new insights into. And there’s also a third aspect, which shouldn’t be overlooked: what you can get funding for. One of the big problems with a lot of digital humanities projects is that it’s hard to get the long-term funding to allow them to show their full potential.

The Making of Charlemagne’s Europe project essentially arose from the coincidence of two aspects. [Slide 3: previous DDH projects] One was that there is a long tradition of creating prosopographic databases at King’s; the Department of Digital Humanities had more than 20 years of expertise in that.
[Slide 4: MKCHEUR spreadsheet] The other was the large number of Carolingian charters which exist, whose prosopographical and socio-economic data hasn’t been fully exploited. It made obvious sense to combine these two factors and also to take advantage of developments in mapping software to represent spatial information in new ways.

CHARTERS AND THEIR REPRESENTATION

I’ve mentioned charters several times, but what exactly are they? The short answer is medieval legal documents, mostly concerned with the granting or selling of property. I’ve given you a handout which is the text and translation of one quite long charter. I’m going to be referring to that a lot in my examples.

But a longer answer is that there are different ways of understanding what a charter is and they logically result in different kinds of digital projects. [Slide 5: understanding] There are three main ways you can think of a charter: as a material object, as a text or as a source of information.
Firstly, you can think of a charter as a material object: a particular piece of parchment with words and signs written on it. [Slide 6: Marburg] A digital project thinking of charters in that way, like the Lichtbildarchiv älterer Originalurkunden, includes high-quality digital images of the charters, combined with tools to allow the analysis of these images. A similar approach is being taken by the new Models of Authority project being jointly developed by King’s College and Glasgow.1

The problem for such an approach with Carolingian charters is that we don’t have the original manuscript for the vast majority of them; instead we only have copies from later in the Middle Ages (or sometimes even just early modern transcripts). An approach focusing on material objects would therefore mean we were unable to use a lot of charters from the corpus.

A second view of charters treats them as texts. [Slide 7: CDLM] One option for our project would have been to set up a database containing the full texts of charters, together with some kind of XML mark-up to allow us to pick out words and terms of particular interest: personal names, titles, verbs indicating donation etc. This is the approach taken by projects such as Codice diplomatico della Lombardia medievale. [Slide 8: CEI] There’s already an international group working on standards for encoding charters.

The difficulty if we went down that route would have been the data we worked with. Carolingian charters have relatively little structure to them, their spelling can be idiosyncratic to the point of unintelligibility and they can also be maddeningly indirect in referring to people and places. [Slide 9: Text on screen] For example, the charter I’ve given you refers to how Tassilo ‘had caused a monastery to be newly built in honour of the Holy Saviour within our forest at the place called Kremsmünster in the pagus of Traungau’ (in monasterium honore sancti Salvatoris infra waldo nostro loco qui dicitur Chremisa in pago nuncupante Drungaoe novo opere construere fecisset).
What bit of that Latin text do you mark up as the ‘name’ of Kremsmünster monastery? [Slide 10: references to Fater] And although Fater is the abbot of Kremsmünster, this fact is only ever mentioned indirectly. So how do you mark up the text to show that Fater’s not just any old abbot, but specifically the abbot of the monastery receiving the donation?

Recent scholarship on charters has tended to favour the textual approach to charters: it’s often focused on individual charters or at the widest regional Urkundenlandschaften (charter landscapes). But we didn’t want to treat charters as special snowflakes: we wanted to find a way to look across the whole corpus from Catalonia to Austria and compare socio-economic data from across different regions. But mark-up of texts wouldn’t allow us to do this easily: its structures aren’t suited to finding all the unfree people mentioned under half a dozen different Latin terms or for finding all female vendors.

To answer those kinds of questions, our project was therefore thinking of charters in a third way: as sources from which information could be extracted and put into standardised formats for comparison and large-scale analysis. And our tools for that were a large-scale relational database combined with the ‘factoid model’.

1
[Slide 11: definition] So what’s the factoid model? It was developed by King’s for previous prosopographical projects. And at its heart is the factoid, an assertion made by the project team that a source says something about a person or a place. Such a model can be applied to anything from a saint's life (the Vita Cuthberti states “Cuthbert healed a woman from a demon”) to a coin (Fitzwilliam Museum coin 839 states "Charlemagne is king of the Franks").

How did we apply the concept of factoids to our specific project? We started off with some main models for basic entities such as agents (people or institutions), places and sources; these contain static information that doesn’t change between charters (information such as a person’s sex or the geo-location of a particular place). We then linked these models together via factoids. Here’s what this factoid model looks like in the specific case of our Charlemagne database. [Slide 12: Factoid diagram]

So factoids are statements which link together agents, places and possessions derived from a specific charter. These factoids are of varying types, depending on the content that they need to contain. For example, you need different data structures to input the statement "Fater is abbot of Kremsmünster" (what we call an attribute and relationship factoid) than to input the statement "Fater petitioned Charlemagne to confirm the property of the monastery of Kremsmünster" (what we call a transaction factoid). Authority lists, meanwhile, are used to ensure that data is entered consistently, e.g. that Fater is not described as "abbot" in one factoid and "abbas" in another.

This factoid method potentially allowed us to input a wide range of data into a single database. We still had to determine a lot of big issues, however. What specific data from each charter did we want to record? What structures and fields would we need for this data and how did we ensure data was entered consistently? And how did we then allow end users to extract the data?

A lot of the first year of the project was spent iteratively developing the database and the inputting standards, rather than being able to carry out much sustained inputting.
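As a rough illustration of the factoid idea described above, here is a minimal Python sketch. It is not the project’s actual schema: the class names, field names and the contents of the authority list are all invented for the example. It only shows static entity records, two factoid types that need different fields, and an authority list that rejects inconsistent terms.

```python
from dataclasses import dataclass

# Hypothetical authority list: the only role terms an inputter may use.
ROLE_AUTHORITY = {"abbot", "king", "duke"}


@dataclass(frozen=True)
class Agent:
    """Static information about a person or institution."""
    name: str
    kind: str  # "person" or "institution"


@dataclass(frozen=True)
class Source:
    """The charter (or other source) that a factoid is derived from."""
    reference: str


@dataclass(frozen=True)
class AttributeFactoid:
    """Assertion: the source says this agent holds this role at this institution."""
    source: Source
    agent: Agent
    role: str
    institution: Agent

    def __post_init__(self):
        # Enforce the authority list, e.g. refuse "abbas" where "abbot" is the agreed term.
        if self.role not in ROLE_AUTHORITY:
            raise ValueError(f"role {self.role!r} is not in the authority list")


@dataclass(frozen=True)
class TransactionFactoid:
    """Assertion: the source says one agent carried out an activity involving another."""
    source: Source
    actor: Agent
    activity: str
    counterpart: Agent


charter = Source("DKAR 1:169")
fater = Agent("Fater", "person")
charlemagne = Agent("Charlemagne", "person")
kremsmuenster = Agent("Kremsmünster", "institution")

factoids = [
    AttributeFactoid(charter, fater, "abbot", kremsmuenster),
    TransactionFactoid(charter, fater, "petition", charlemagne),
]
```

Trying to create `AttributeFactoid(charter, fater, "abbas", kremsmuenster)` raises a `ValueError`, which is the kind of consistency check an authority list is meant to provide.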
We started in the most basic way: just looking hard at the data. [Slide 13] Discussing how we dealt with people who were being transacted, [Slide 14] collating examples of terms used to form the basis for authority lists. The charter I’ve given you as a handout, DKAR 1:169, was one that we went through several times in considerable detail, because it had the kind of complexity that we needed to be able to handle within the system. It has 18 different places in it: I know because we analysed every single one of them. It also contained a lot of complicated interactions between different agents and places; [Slide 15: diagram] here’s one of my early attempts at drawing a diagram of that. I’ll come back to that diagram later on.

We were also thinking about the questions users might want to ask [Slide 16: Wendy Davies and Mark Mersiowsky]2, considering the different research topics that historians had already explored using charters. [Slide 17: questions list] We ended up with a long list of possible questions that the database might be used to answer. Looking at them now at the end of the project, most of them are too complicated to answer with the current version of the user interface. But it’s by thinking of key research questions in that way that we could work out which data to record. For example, we developed structures to record the relationships between places, an option which wasn’t available in previous projects. I’ll talk about those a bit later.

2 Wendy Davies, Small worlds: the village community in early medieval Brittany (London, 1988); Mark Mersiowsky, 'Y-a-t-il une influence des actes royaux sur les actes privés du IXe siècle?', in Marie-José Gasse-Grandjean and Benoît-Michel Tock (eds.), Les actes comme expression du pouvoir au haut Moyen âge: actes de la table ronde de Nancy, 26-27 novembre 1999, Atelier de recherches sur les textes médiévaux, 5 (Turnhout, 2003), pp. 139-78

Another part of developing the project was deciding its limits: some of the issues that we wouldn’t realistically be able to tackle. The single most frequent complaint we have had concerning the database is that we don’t include the full text of the charters (although we hope in the future to include links to the full text available from other online sites). But getting such data into the system consistently would have required a vast amount of extra work. I’ve given you the text of DKAR 1:169 on the handout. Here’s the first sentence of a text of a different edition of the same charter [Slide 18: Kremsmünster charter]. This single sentence has 6 differences from the text I’ve given you [click]. If we’d used project time to input the full text of the charters we probably wouldn’t have had time to do anything else.

DATA STRUCTURES AND STANDARDS

There are always things you’re not able to achieve within one project. But even without the full text, why did developing data structures and standards for the Charlemagne database take so long? What I want to do now is to look in some detail at a couple of examples of the data structures we created. Every database project has to do this kind of work, so although some of the details I’ll be discussing are specific to early medieval charters, I hope you’ll find a lot of the concepts are more widely relevant. Our underlying relational database is horrendously complex. The schema for it went from this [slide 19] to this [slide 20] and it’s now even bigger. But because it’s a relational database, you’re essentially building up this complexity from a large number of relatively small building blocks.
I’m going to be talking about two of these: places and transaction factoids, using examples mainly from the charter I’ve given you on the handout.

Places

Thinking about places, two main aspects drove the structures we developed for recording them. One was technological developments in digital mapping; we wanted to be able to represent data patterns on a map on screen. The other was the data itself. [Slide 21] In charters, what we get isn’t technically places, but place names. How you get from a name in a medieval charter to a spot on the map turns out to require a lot of thought.

The first distinction we made in the database was between charter-specific information on place names (which we recorded in individual factoids) and static information on a place (which was recorded in the place model). [Slide 22] For example, the charter I’ve given you, DKAR 1:169, has two different manuscripts and their spelling of places varies. The place name edited as ‘Bettinbah’ has a variant spelling in footnote 11 of ‘Petenpach’. There’s another charter recording Tassilo’s original grant which calls the same place ‘Petinpach’. In some charters you even get several different spellings of a name (especially personal names) within a single text. And there’s also the complication of Latin being an inflected language, so you can get names in different cases, like the ‘actum Wormacie’ in the last line of the charter.
One of our early decisions therefore was that we’d include fields to record the original text of place names and that these original text boxes would be searchable. [Slide 23: place name search] If you put ‘Wormacie’ into the search place names function of the database, you’ll correctly find ‘Worms’, be told that it’s in Germany and even have it located on a map for you, so that you have what you might call a ‘bottom-up’ search. If you’ve got just a name in a medieval charter you’ve got a possible way of identifying it.

But behind that linking from a name instance to a map, there’s a lot of work going on in terms of identification and standardisation. I’ll show this schematically with another place in the same charter: the place called Liublinbah (towards the bottom of the first page of the handout). [Slide 24: charter-specific information] We start off with what the charter tells us: Liublinbah is a ‘locus’, its role in the charter is as the location of a possession being donated and it’s in the pagus of Drungao or Traungau (pagi are Carolingian administrative regions, similar to Anglo-Saxon counties).

The first thing we have to do is produce a standardised medieval name for this place. Our main reference source for such Latin names is Orbis Latinus [slide 25], but like the majority of minor places in our charter, Liublinbah doesn’t appear in that. [Slide 26: standard medieval name] So instead we choose one of the spellings and use that (adding an asterisk to show that it may need further work at some point).

The next question is where this place is. We rely on the editors of the charter for this; we don’t try and research details for ourselves, since we simply don’t have time. [slide 27: place identification] In this case, the editor says that the settlement concerned is Leombach, a location in Austria and his identification is certain.
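The steps described so far (keeping the original spelling, choosing a standardised medieval name, taking over the editor’s identification, and searching ‘bottom-up’ from a spelling) might be sketched as follows. This is only an illustration under assumptions: the class and field names are hypothetical, the search is a simple exact match on the original text, and the place role given to Worms is my guess from ‘actum Wormacie’ rather than something read off the real database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Place:
    """Static place record (hypothetical fields)."""
    modern_name: str
    country: str
    standard_medieval_name: Optional[str] = None  # trailing "*" = provisional form
    identification: Optional[str] = None          # the editor's verdict, e.g. "certain"


@dataclass
class PlaceNameInstance:
    """Charter-specific record of how a place is named and used."""
    charter: str
    original_text: str  # spelling exactly as printed in the edition
    place: Place
    descriptor: Optional[str] = None
    role: Optional[str] = None


def search_place_names(instances: List[PlaceNameInstance], text: str) -> List[Place]:
    """'Bottom-up' search: match an attested spelling, return the distinct places."""
    wanted = text.casefold()
    found: List[Place] = []
    for instance in instances:
        if instance.original_text.casefold() == wanted and instance.place not in found:
            found.append(instance.place)
    return found


leombach = Place("Leombach", "Austria",
                 standard_medieval_name="Liublinbah*", identification="certain")
worms = Place("Worms", "Germany")

instances = [
    PlaceNameInstance("DKAR 1:169", "Liublinbah", leombach,
                      descriptor="locus", role="location of possession"),
    # Assumed role for illustration only.
    PlaceNameInstance("DKAR 1:169", "Wormacie", worms, role="location of transaction"),
]

for place in search_place_names(instances, "wormacie"):
    print(place.modern_name, place.country)  # Worms Germany
```

Because the spelling lives on the name instance and not on the place, further variants (or inflected forms) can be added as extra instances without touching the place record itself.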
[slide 28] Once we have this identification, we can use modern reference tools to check the geo-location for Leombach and what part of Austria it’s in. [slide 29] From this, we can create a place record for Leombach. [slide 30: map from DKAR 1:169] When we enter data concerning the charter itself we then include charter-specific information such as the place’s role or the place descriptor.

That may all seem rather long-winded, but this way of thinking about places gives us a lot of flexibility for dealing with more complex cases. For example, we quite often get rivers or mountains mentioned in charters; [slide 31: Ipfbach] in DKAR 1:169 there are the two rivers called Ipfbach (Ipphas in the original). We can’t easily geo-locate rivers, but we can input the name data we do have into the system and produce records for natural features in that way.

The more difficult situation is when there’s uncertainty about what place the medieval name refers to. Sometimes, the editor will just give an approximate area. [Slide 32: Raotola] The editor of DKAR 1:169 thinks Raotola, where there are several vineyards, is somewhere on the Rodelbach, a tributary of the Danube. We input the place as having a medieval place name, but no modern one, but an approximate location means we can record it within modern place hierarchies: it’s somewhere within Upper Austria.

Alternatively the editor may suggest one or more possible modern locations for a particular medieval place. For example, a charter from Mondsee discusses a donation of property at an unknown place called Teginga. [Slide 33: schematic] Here is where we effectively carve up our earlier schematic into two. We still have a medieval place on one side, but it is now being tentatively matched to three different modern places, with varying levels of probability. [Slide 34] By recording possible matches in this way, we can eventually generate a display for the users that shows the possible options.

The final aspect of places I want to talk about is place relationship factoids. One of the things we realised earlier on when looking at our place data is that we had two different sorts of place hierarchy, where places are in larger units. Firstly, we had the modern hierarchies: Leombach is in the Austrian state Upper Austria, which is in the modern country Austria. That was easy enough to record. But what did we do about the other information we’re being given, that Leombach is in the pagus of Traungau? How do we record medieval hierarchies?

Previous projects at King’s have been able to fudge this and meld together modern and medieval hierarchies because they’ve been dealing with a British administrative system that’s extraordinarily long-lasting. For example, the Prosopography of Anglo-Saxon England used the pre-1974 English counties for their hierarchies, which are near enough to Anglo-Saxon counties to be workable. But we didn’t know where Carolingian pagi were on the map. In fact, there’s even a scholarly argument about whether pagi were flat areas on the ground at all or just scattered collections of administrative rights. So to record the fact that Leombach was in the pagus of Traungau, which might help researchers to understand more about pagi, we needed some additional structures. What we ended up using is what’s called a place relationship factoid.
[Slide 35] This is a charter-specific assertion that “Charter X says Place 1 is in Place 2”, and also includes place descriptors for the two places concerned. The original idea was that a user would be able to pull up a place record for a medieval region and then be able to see all the places said to be within it (including geo-locations where they’re known). We weren’t able to implement this fully, but even so these factoids still provide useful information. And they also illustrate another important aspect of any digital humanities project. If you think a piece of data might be useful, it’s much better to start recording it early on, rather than having to go back later and recheck a large number of records.

Transaction factoids

As you can see, just designing the building blocks for the database system, like places, takes a lot of thought if you’re going to be able to record them consistently. Our trickiest problem, however, was working out how to record transactions, the actual business of the charters, within the database. And it’s difficult because we’re trying to record dynamic rather than just static information. If you look at this place relationship factoid, for example, the statement ‘charter X says Leombach is in Traungau’ doesn’t alter throughout the charter. Similarly, when DKAR 1:169 refers to ‘Abbot Fater’, which we’d record as the attribute and relationship factoid ‘Fater is abbot of Kremsmünster’, that’s a fixed statement. There may be earlier and later charters in which Fater isn’t abbot of Kremsmünster, but in this particular charter he always is.

In contrast, the main importance of a charter is that it’s changing things, typically someone granting property to someone else. Things are different after the actions described in the charter than they were before. But how do you record such a change in a relational database structure that doesn’t allow for different states?
The first thing we did was simplify things by breaking down the activities in any charter into a collection of different transactions (possessions flowing around) and events (all the other things going on). [Slide 36] I showed you this diagram for the activities in DKAR 1:169 earlier on. It’s very complex because it shows you almost all the activities (though in fact there are a few more even than that). But if we start breaking these activities down, we can get rather more manageable units.

Firstly, there are three different events. [Slide 37] Tassilo founds the monastery of Kremsmünster and [Slide 38] there are two examples of land clearance. [Slide 39] Activities like these are recorded in event factoids, which give fairly basic information about agents, places and the type of activity going on.

Once we’ve dealt with those, we’re left with the transaction information [Slide 40]. What we have in this diagram is three separate transactions. [Slide 41] One is a record of what happened in the past: Tassilo granted possessions to Kremsmünster. [Slide 42] The other two are Charlemagne’s actions in the present. He confirms Kremsmünster’s possessions and he also grants to the men of Eberstalzell the right to remain on land they’ve cleared illegally.

We need to break down the charter into these separate transactions to allow us to represent the flows of possessions in a database structure. [Slide 43] So for example, Tassilo’s initial grant to Kremsmünster can be represented in a series of tables that shows the agents involved, then the details of the places mentioned, then the possession transferred and so on. We’ve frozen the activity in a way that allows us to describe it within a relational database.
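To give a feel for what ‘a series of tables’ can mean in practice, here is a small sketch using Python’s built-in sqlite3 module. The table and column names are invented for illustration and are far simpler than the real schema; the two possessions are taken from the list in DKAR 1:169.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY,
    charter TEXT NOT NULL,
    activity TEXT NOT NULL          -- e.g. grant, confirmation
);
CREATE TABLE transaction_agents (
    transaction_id INTEGER REFERENCES transactions(id),
    agent TEXT NOT NULL,
    role TEXT NOT NULL              -- e.g. granter, recipient
);
CREATE TABLE transaction_possessions (
    transaction_id INTEGER REFERENCES transactions(id),
    possession TEXT NOT NULL,
    place TEXT                      -- where the possession is located
);
""")

# Tassilo's past grant to Kremsmünster, frozen as rows in three tables.
conn.execute("INSERT INTO transactions VALUES (1, 'DKAR 1:169', 'grant')")
conn.executemany(
    "INSERT INTO transaction_agents VALUES (1, ?, ?)",
    [("Tassilo", "granter"), ("Kremsmünster", "recipient")],
)
conn.executemany(
    "INSERT INTO transaction_possessions VALUES (1, ?, ?)",
    [("church", "Alburg"), ("villa", "Alkoven")],
)

# Who received what, and where is it?
rows = conn.execute("""
    SELECT a.agent, p.possession, p.place
    FROM transaction_agents AS a
    JOIN transaction_possessions AS p ON p.transaction_id = a.transaction_id
    WHERE a.role = 'recipient'
    ORDER BY p.place
""").fetchall()
print(rows)  # [('Kremsmünster', 'church', 'Alburg'), ('Kremsmünster', 'villa', 'Alkoven')]
```

The transaction row itself holds no agents or possessions; everything hangs off its id, which is what lets one record carry any number of agents and possessions.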
This is a simplified version of the structure we use to record transactions, but it still gives us quite a lot of flexibility. [Slide 44: multiple donations] For example, it means that we can record a number of donated possessions in the same record; we don’t need 22 different factoids for the 22 different things Tassilo gave to Kremsmünster, which is a big relief. And if there are different locations or terms and conditions for different possessions we can record those in one go.

But if we’re going to use a data structure like this, we need to make sure that we can interpret everything unambiguously from the information we record in the table. [Slide 45] If you try and put both Charlemagne’s confirmation to Kremsmünster and his grant to the men of Eberstalzell in the same record, how do you keep track of who’s getting what possessions? You rapidly get a complete mess. So we had to define a transaction as involving only two main agents (or agents working together) and only one type of activity, so we didn’t combine a confirmation and a grant in the same record.

We still had further difficulties to solve. One issue was that the neat distinction I’ve been making all along between agents, places and possessions doesn’t actually work when you look closely at the data. [Slide 46] For example, look at the 22 possessions that DKAR 1:169 mentions. As you’ll see, among the things being granted [click] are entities that we regard as agents: churches, for example, like that at Alburg, but also people being granted, like the craftsmen in Raotola. And as well as land at particular places being given, there’s also a whole place being given: the villa of Alkoven [click]. We had to ensure that we could record agents and places as possessions; we also still had to record them as agents and places in the normal way. People don’t cease to be people just because the Duke of Bavaria treats them as objects for some purposes.

[Slide 47] As you may also have noticed, possessions no. 16 and 18 on this list have an additional complication: there’s more than one object involved. Tassilo’s giving 2 vineyards at Aschach and 3 at Raotola, so we also needed to record quantities of objects being transferred.

A final issue was that not all the transactions we were interested in had identical flows of possessions. [Slide 48: sale] In a sale, there are possessions going both ways: how do you record that? [click] Our approach was to add another column to the table for possessions: if the return box is marked, it means that these particular possessions are flowing in the opposite direction. [Slide 49] In the online version of the database, we use an arrow to highlight this.

Unfortunately, however, DKAR 1:169 has an even more complicated twist. [Slide 50: transaction] Let’s go back to Charlemagne’s interaction with the men of Eberstalzell. Charlemagne grants to these men the right to remain on land they’ve cleared, but in return they have to do service not to him, but to the monastery of Kremsmünster. This is no longer an arrangement between two parties: a third party is now involved. [Slide 51: diagram] After a lot of discussion, we eventually developed a data structure in which we could record such arrangements, which we called third-party returns. Essentially this was a variant of the method we’d already used for ordinary returns; we just needed to add another tick-box to the input form.

But this highlights a key issue when you’re designing data structures: there’s a trade-off between how accurately you can represent your data and how complex your data structures need to be. It’s always a temptation to develop a database that can deal with every possible data variant, but you end up by making it more and more complicated and difficult to use. As it was, we had input screens where you had to scroll across horizontally to see all the fields you had to input.
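The quantity, return and third-party return ideas above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the project’s structure: the names are hypothetical, and where the real input form used a tick-box for third-party returns, the sketch stores the third party’s name directly so that the direction of each flow can be worked out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PossessionLine:
    description: str
    quantity: int = 1
    is_return: bool = False            # flows back from receiver to giver
    third_party: Optional[str] = None  # if set, the return goes to this agent instead


@dataclass
class Transaction:
    charter: str
    activity: str  # exactly one type of activity per record
    giver: str     # exactly two main agents per record
    receiver: str
    lines: List[PossessionLine]


def flows(tx: Transaction) -> List[Tuple[str, str, int, str]]:
    """Work out (from, to, quantity, description) for every possession line."""
    result = []
    for line in tx.lines:
        if line.third_party is not None:
            source, target = tx.receiver, line.third_party
        elif line.is_return:
            source, target = tx.receiver, tx.giver
        else:
            source, target = tx.giver, tx.receiver
        result.append((source, target, line.quantity, line.description))
    return result


grant = Transaction(
    "DKAR 1:169", "grant", "Tassilo", "Kremsmünster",
    [PossessionLine("vineyards at Aschach", quantity=2),
     PossessionLine("vineyards at Raotola", quantity=3)],
)

eberstalzell = Transaction(
    "DKAR 1:169", "grant", "Charlemagne", "men of Eberstalzell",
    [PossessionLine("right to remain on cleared land"),
     PossessionLine("service", third_party="Kremsmünster")],
)

for flow in flows(eberstalzell):
    print(flow)
```

Running this prints one flow from Charlemagne to the men of Eberstalzell and one from the men of Eberstalzell to Kremsmünster, which is the three-party arrangement described above. Every extra case handled this way adds another field to each possession line, which is the trade-off in miniature.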
[Slide 52: shoes] Eventually there comes a point where you have to decide you’re not going to change your data structures any more; you just have to fit the data, however imperfectly, into the existing structures. The problem is knowing when you’ve got to that point. Do you design elaborate data structures for the minority of cases that are as complex as the charter I’ve given you? The problem comes in knowing how complex the average charter is before you’ve looked at it in detail. Third-party flows turned out to appear only in about 3% of charters; possibly we could have dealt with them in some other way, but at the time this seemed the most effective approach.

More generally, when designing structures and standards it’s very easy to make decisions that you later realise were wrong. For example, we didn’t initially treat all churches as agents, but tried to distinguish between more and less important churches. Towards the end of the project, we realised this approach wasn’t working and had to go back and re-input some of the data.

So developing data structures and standards for the Charlemagne project was an odd mixture. We had to combine detailed conceptual analysis of platonic ideals of names, charters, transactions and the like with the messy reality of the actual forms that such things take in practice. John Bradley and Michele Pasin, members of our digital humanities team, once wrote a paper entitled ‘Structuring that which cannot be structured’3 and in a sense that’s what we’ve been trying to do. But it’s only if you can input data into the database in a way that’s both structured and that retains as much of its meaning as possible that you can get it out again in a useful form. That’s what I want to discuss briefly in the final part of this talk.

GETTING DATA OUT

In some ways getting data out is harder than getting it in. Just sorting out the displays of factoids so that they make sense to end-users is time-consuming and there are still aspects of that that could do with being improved. But the biggest problem when we were designing the user interface was providing an effective way for users to browse the database and find the particular information they were looking for.

The main method we used was ‘faceted browsing’, which is increasingly used by many databases. To explain how that works, I’ll show you a very simple example, not using charters. [Slide 53] Suppose you have a database that contains information about coloured shapes. How do you find the object with the particular shape and colour you want?

The traditional way is with a search box [Slide 54], but there are several problems with this. The first is not knowing the right terms to use. [Slide 55] For example, you input the term ‘square’ and get zero results. Why is that? Because the shapes in the database aren’t actually squares, even if they look like it, but rectangles, so they’re all listed under that term. Similarly, suppose you’re interested in particular sorts of triangle. [slide 56] You input the pair of terms ‘red’ and ‘triangle’. Again you get zero results. But why is that? Does the database not contain triangles? Or has it classified them all under some different term? Perhaps it thinks that colour isn’t red but scarlet? Searching a database you’re not familiar with can be worryingly like making repeated stabs in the dark.
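The search-box problem is easy to reproduce. In this small Python sketch the four records are made up for the example and `search` is a deliberately naive exact-term matcher; two quite different situations both come back as an empty result.

```python
# Made-up records for the coloured shapes example.
shapes = [
    {"shape": "rectangle", "colour": "red"},
    {"shape": "rectangle", "colour": "blue"},
    {"shape": "triangle", "colour": "green"},
    {"shape": "circle", "colour": "red"},
]


def search(records, *terms):
    """Return the records whose field values include every search term exactly."""
    return [r for r in records if all(term in r.values() for term in terms)]


print(len(search(shapes, "square")))           # 0: they are stored as 'rectangle'
print(len(search(shapes, "red", "triangle")))  # 0: there are no red triangles
```

From the user’s side the two zeros look identical: nothing tells you whether the term was wrong or the thing you wanted simply isn’t there.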
Faceted browsing, in contrast, provides an easier way to narrow down your search till you find exactly what you’re looking for. [Slide 57] In this case, you might be given a choice of two filters to browse by: by shape or by colour. Choose one of these [click] and you then get shown how many examples there are of each colour [click]. It’s much simpler to find all the red objects you want [Slide 58].

But you’re not interested in all red objects, just red triangles. Here’s where you can combine filters. [Slide 59] Choose the shape filter, and you’re shown what shapes the red objects have. So there definitely aren’t any red triangles in the database, it’s not just that you’ve got your search terms wrong. And if you want to, you can now clear the colour filter and use the shape filter to look for triangles of any colour [Slide 60].

Faceted browsing therefore offers users an easy way of narrowing down entries in a database to find exactly what they want. You’re always kept aware of what facets you’ve used already and you can remove some of them if you find you’re not getting any results. The same basic principles as with the coloured shapes underlie the much more complicated faceted browsing in our database. You can browse by charter, agent or place and gradually drill down to find what you want [Slide 61: 3 clicks].

3 Bradley, John, and Pasin, Michele, 'Structuring that which cannot be structured: a role for formal models in representing aspects of Medieval Scotland', in Matthew Hammond (ed.), New perspectives on medieval Scotland, 1093-1286, Woodbridge: The Boydell Press, 2013, pp. 203-14

Faceting makes things a lot easier for users, but it doesn’t solve all their search problems. Users of our database can browse by either charters, agents or places, but they have to think about the meaning of the filters they use in these different views to get the results they expect. Suppose you’re browsing by charters. You can choose to filter the charters by various characteristics of the agents they contain, so you could choose all those that include scribes [Slide 62] and then add in women as an additional filter [Slide 63]. You end up with 177 charters, but that doesn’t mean that there are 177 charters written by female scribes, but that there are 177 which include a scribe of some sex and also a woman in some role. [Slide 64] In fact, if you search via agents, you find that so far we haven’t found any scribes who are definitely female.

In addition, faceted browsing with a database like ours (unlike with the coloured shapes example) needs a lot of behind-the-scenes work so that when users click on a filter they get the results they expect. To demonstrate that, I want to talk about one of the most complicated filters we had to develop, that linking agents and places. If you want to find all the agents connected with the place Worms, for example, how do you define this connection in a meaningful way?

A simple-minded rule would say that all agents who appear in a charter which mentions place X are connected with place X. [Slide 65] But as you’ll see from the example of DKAR 1:169 you get some unsatisfactory results.
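The simple-minded rule takes only a few lines of Python, which is part of its appeal. The lists below are a partial, hand-picked selection of agents and places from DKAR 1:169, and the data layout is hypothetical, chosen purely to show what the rule produces.

```python
# Partial lists for one charter; the layout is invented for this sketch.
charter_agents = {
    "DKAR 1:169": [
        "Charlemagne", "Tassilo", "Fater", "Kremsmünster",
        "men of Eberstalzell", "craftsmen in Raotola", "Slavs",
    ],
}
charter_places = {
    "DKAR 1:169": ["Worms", "Kremsmünster", "Leombach", "Raotola", "Eberstalzell"],
}


def naive_agents_for_place(place):
    """Simple-minded rule: every agent in any charter that mentions the place."""
    linked = set()
    for charter, places in charter_places.items():
        if place in places:
            linked.update(charter_agents[charter])
    return linked


print(sorted(naive_agents_for_place("Worms")))
```

Every one of the seven agents comes back as ‘connected’ with Worms, including the Slavs and Tassilo, simply because Worms is named somewhere in the same charter.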
The group of Slavs somewhere out in the wilds of Austria, for example, may well not have known anything about Worms. Does it make sense to connect them to it? Equally, if we happened to know the scribe of this charter (we don’t), he’d be sitting in Worms and the first he may have heard of Leombach is when he’s asked to write a charter mentioning it. As for Tassilo, by 791 when this charter was written, he’d been deposed by Charlemagne and was being held in a monastery in Normandy. Connecting him to Worms because a charter he’d once given was confirmed there seems tenuous at best.

What we had to do therefore was look at how agents are connected to places in the charter by the roles they play. [Slide 66] You immediately get a much smaller but more meaningful set of connections. So, for example, Tassilo as a granter is connected to the different places he donated and so is Kremsmünster as the recipient. But the monastery is also connected to Worms, because it (or at least Fater representing it) went to Worms to get a confirmation charter from Charlemagne. The beekeepers of Raotola, meanwhile, are connected to Raotola, but not to any of the other places mentioned in the charter. We don’t know if they ever went to some of the other properties of Kremsmünster that are mentioned. And we don’t know exactly where the decania of Slavs was, so they don’t get connected to anywhere.

We obviously couldn’t do this sort of detailed analysis of agent and place interconnections for every charter. [Slide 67] Instead, we had to come up with rules (still very complicated) for how agent roles and place roles are connected together. For example, anyone whose agent role makes them responsible for the flows of possessions (like granters and recipients) should be linked to all places which have the place role ‘location of possession’. In most cases (which we specify) they should also be linked to any place with the place role ‘location of transaction’. Anyone whose agent role just involves them being present at a transaction (like a petitioner) should be linked only to the place with role ‘location of transaction’.

The full rules just for this facet probably took us a month or more of discussion to work out. [Slide 68] Here’s an extract from the final results. This document gives the rules, but it also includes the test data: examples that we could use to check the facets were working as we wanted. Doing this testing was incredibly pernickety and tedious for both the historians and IT specialists [Slide 69]. But it was the only way of checking that users would always get the expected results. Faceted browsing is a very powerful tool for users, but it’s not easy to set up a database to be able to use it.

CONCLUSIONS (5 min)

In this talk I’ve tried to give a feel for the practicalities of a medieval database project and to show something of the messiness behind the neat facade. I want to end with five more general points about combining digital technology and historical texts. [Slide 70]

[click] One of the most basic problems we had with the project didn’t involve the digital aspect at all. It was simply understanding what some charters actually meant. What is the decania of Slavs that DKAR 1:169 refers to? We decided it was probably some kind of administrative unit, but the charter’s irritatingly vague and we’re still not quite sure.

[click] A second issue is how to avoid re-inventing the wheel with digital history projects. We did our best to build on what previous projects had done, but it’s sometimes surprisingly difficult to find out specific technical details of other projects. We’re therefore doing our best to preserve and document the knowledge we’ve gained, through the website’s blog and also through presentations like this.
[click] One of the other reasons projects tend to end up re-inventing the wheel is that different historical periods produce not just different types of document, but different styles. The People of Medieval Scotland project, for example, was a previous project based on charters, but these are far more standardised documents in the twelfth century than in the eighth. Changing social practices also make standard practices across databases very difficult to implement. POMS, dealing with Scotland in the central Middle Ages, found it useful to record Gaelic and Latin names separately, which wasn’t an issue for us. On the other hand, about half the people in medieval Scotland seem to have been called either John, William or Robert, so the researchers on POMS didn’t have to spend so much time arguing about whether Adalbert really is the same name as Odalpert.

[click] Fourthly, although I’ve focused on data structures in this talk, data input standards are also very important for any historical project and they do have to be specified in incredible detail. To use an analogy, you can’t have one inputter classifying shapes as rectangles while another sees them as squares. Even the most basic data to be input has to be agreed.

Even so, problems of inconsistent inputting over time are inevitable, even if it’s just one person doing that. We did our best to discuss and document the decisions we made, using wiki software and a lot of Skype meetings, but we still had to do a lot of data clean-up towards the end of the project.

[click] Which leads me to my final point about historical databases. They’re inevitably imperfect. It doesn’t have to be quite at the level of garbage in, garbage out, but behind the clean facade of any database project there tends to be some very messy data and a lot of compromises on database design. But then history is messy and early medieval history is particularly so. The Charlemagne database isn’t perfect, but despite all its imperfections, we hope it’ll still be a valuable tool for future researchers.