M.Romanello Ecal Presentation

  1. Linking Text References to Relevant Digital Resources Over The Web
     Matteo Romanello (matteo.romanello@yahoo.it), University “Ca' Foscari” of Venice
     Electronic Corpora for Ancient Languages, Prague, November 16th-17th 2007
  2. A Microformat for Canonical Texts References
     • Topic: how to link secondary sources to corpora of ancient-language texts?
     • Goal: to give scholars reading the Digital Library's primary and secondary sources more powerful research tools and a richer reading experience
     • Focus: references to Canonical Texts in XHTML
     • Scope of the examples: Classical (Greek and Latin) literature
  3. Digital Library on Classics: the State of the Art
     • A few on-line secondary sources (journal articles and monographs) available as (X)HTML
     • A few authoritative, born-digital on-line journals: e.g. Classics@, published by Harvard's Center for Hellenic Studies
     • Some on-line text corpora (Perseus and other minor scattered collections)
     • Some resources and reviews of electronic resources for humanists, reviews of books...
     • Research blogs
  4. Current e-scholarship scenarios (1)
     Scenario 1
     John is a scholar of Greek literature and wants to find all on-line articles or electronic resources related to the verse he is focusing on (Hom. Il. 20.249). He submits to Google a query like 'Hom. Il. 20.249', and what Google retrieves is not pertinent or interesting. Ordinary search engines are purely text-based (no semantics, language dependent, etc.). John would like to have a more precise or specialized search engine, perhaps capable of understanding the semantics of the reference he typed in as the query string.
     [Diagram: "Author of Iliad and Odyssey" linked to the name variants Homer, Homère, Omero, ...]
  5. Current e-scholarship scenarios (2)
     Scenario 2
     John's colleague points out to him that Gregory Nagy, in a passage of the 2nd chapter, mentions the passage John is interested in. John finds an on-line version of the book and opens it in his browser...
  6. Current e-scholarship scenarios (3)
     To have a significant e-reading experience, John should be able to read the cited verse in its context, to compare the text of that verse as recorded in different manuscripts, and to read the same passage in a given translation or read a commentary on it.
  7. New e-scholarship scenarios (1)
     • Semantic understanding of text references by the web browser
     • Search for resources pertinent to the author, the work or the precise text passage referred to
  8. New e-scholarship scenarios (2)
     • Value added services (VAS) for scholars
       – Reference linking
       – Related resources
       – Targeted and semantic-oriented search
       – Different exemplars of a work
     • Problems:
       1) To build a distributed library
       2) To provide VAS linking secondary to primary sources
  9. From printed to digital libraries
     • Find new constructive paradigms to take advantage of the net's properties
     • In a network environment:
       – Library universally distributed and with higher granularity
       – Provide reference linking
     • Reference linking to primary sources (from references in secondary sources):
       – E.g. move from the citation Hom. Il. 1.1 to all available translations, comparing critical editions and finding related resources
  10. The evolution of ancient languages corpora
     • TLG (1970s) -> mass storage and rapid retrieval
     • Perseus (1980s) -> richer media and higher level data structures
     • DLs + web protocol -> convergence of
       – XML related technologies:
         • TEI (encoding)
         • XML Db (storage of structured data)
         • Query capabilities over HTTP
       – Web services communicating over REST
       – Success of a distributed architecture (cf. OAI-PMH)
     Which protocol? The Canonical Text Services protocol
  11. A new paradigm for building on-line corpora: the CTS protocol (1)
     • CTS web protocol:
       – new paradigm for building electronic corpora
       – gives hierarchical access to works as XML-TEI files
       – relies on the model described by FRBR
       – developed by Neel Smith et al. at Harvard's CHS
       – built on the Registry Services Protocol (v. 1.0.rc1) -> authority lists
     • Some CTS related projects:
       – Perseus' CTS interface
       – Multitext Homer
  12. A new paradigm for building on-line corpora: the CTS protocol (2)
     • CTS-compliant text server
     • Texts: XML TEI
     • Textgroups and works are identified by URNs
     • Collections described by authority lists
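The hierarchical identification of textgroups and works by URN can be illustrated with a minimal Python sketch. It assumes the colon-separated form used by the URN quoted later in these slides (urn:cts:greekLit:tlg0012:tlg001:20.131-20.137); the class and function names are hypothetical and are not part of the CTS specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CtsUrn:
    """Components of a CTS URN, from the most general to the most specific."""
    namespace: str           # e.g. "greekLit"
    textgroup: str           # e.g. "tlg0012"
    work: Optional[str]      # e.g. "tlg001", absent when the URN names a textgroup only
    passage: Optional[str]   # e.g. "20.131-20.137", absent when the URN names a whole work


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a colon-separated CTS URN into its components (hypothetical helper)."""
    parts = urn.split(":")
    if not 4 <= len(parts) <= 6 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    work = parts[4] if len(parts) > 4 else None
    passage = parts[5] if len(parts) > 5 else None
    return CtsUrn(namespace=parts[2], textgroup=parts[3], work=work, passage=passage)


urn = parse_cts_urn("urn:cts:greekLit:tlg0012:tlg001:20.131-20.137")
print(urn.textgroup, urn.work, urn.passage)
```

Each level of the URN narrows the one before it, which is what lets a client address an author, a work, or a single passage with the same identifier scheme.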
  13. Reference Linking in the Digital Library
  14. Linking primary to secondary sources on-line: state of the art
     • Two very loosely coupled systems
     • No born-digital equivalent to printed references
     • Most projects use an internal linking system:
       – A good degree of hypertextuality
       – Fairly closed systems of hard-linked resources
     • Digital references == strings
       – No semantic information
       – No reference-aware information processing
       – Disambiguation of abbreviations and implicit statements is left to the reader
  15. A digital companion to printed canonical texts references
     • Problem: provide a digital companion to printed references
       – to express references in a simple and semantic way
         • exploiting the opportunities given by the digital medium
         • separating semantics from presentational matters
     • Solution:
       – mapping references to requests compliant with the protocol used to build a distributed library (CTS)
       – embedding chunks of semantic information within XHTML docs
     • Implementation: Microformats (from Web 2.0)
     • Goal: to design a Microformat for Canonical Text references
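The first part of the solution above, mapping a printed reference to a protocol request, might look roughly like the following Python sketch. Several things here are assumptions: the abbreviation tables are tiny stand-ins for a real authority list, the endpoint http://cts.example.org/CTS is a placeholder, and the `request` and `urn` parameter names reflect one reading of the CTS specification and should be checked against the version cited in the references.

```python
import re
from urllib.parse import urlencode

# Hypothetical, minimal authority tables; a real system would load these
# from an authority list rather than hard-code them.
TEXTGROUPS = {"Hom.": "tlg0012"}
WORKS = {("Hom.", "Il."): "tlg001"}

# Matches references of the form "Hom. Il. 20.249" only.
REFERENCE = re.compile(
    r"^(?P<author>\S+\.)\s+(?P<work>\S+\.)\s+(?P<book>\d+)\.(?P<line>\d+)$"
)


def reference_to_urn(reference: str) -> str:
    """Turn an abbreviated printed reference into a CTS URN."""
    match = REFERENCE.match(reference.strip())
    if match is None:
        raise ValueError(f"unsupported reference format: {reference!r}")
    author, work = match["author"], match["work"]
    if author not in TEXTGROUPS or (author, work) not in WORKS:
        raise ValueError(f"unknown author or work in: {reference!r}")
    passage = f"{match['book']}.{match['line']}"
    return f"urn:cts:greekLit:{TEXTGROUPS[author]}:{WORKS[(author, work)]}:{passage}"


def passage_request(base_url: str, urn: str) -> str:
    """Build the URL of a passage request against a (placeholder) CTS endpoint."""
    return base_url + "?" + urlencode({"request": "GetPassage", "urn": urn})


urn = reference_to_urn("Hom. Il. 20.249")
print(urn)
print(passage_request("http://cts.example.org/CTS", urn))
```

Keeping the abbreviation lookup separate from the URL construction means the same URN can also be embedded in XHTML or used as a tag, independently of any particular server.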
  16. Microformats or RDF?
     • Microformats (MFs) = a bottom-up route to the Semantic Web ("real-world semantics", or the lower-case semantic web)
     • Used within blogs for friendships, geographical data, reviews...
     • Firefox 3 -> native support for Microformats (microformatted content display integrated in the UI)
     • Not the only way to embed metadata inside common tag elements
       – RDFa <http://www.w3.org/TR/xhtml-rdfa-primer/> proposed by W3C
  17. Microformats vs RDF
     [Figure; only the label "Microformats" survives]
  18. Microformats or RDF?
  19. Microformats or RDF?
  20. Microformats: definition
     • Microformats are:
       – XHTML (POSH) compounds
       – A set of design principles for formats
       – A set of simple open data formats built upon existing and widely adopted standards
     • Microformats are not:
       – A new language
       – An attempt to change everyone's current behavior
     • Goals:
       – Make data reusable and interoperable among webservices and mashup applications
  21. Text references: different use cases
     1. Politics
     2. like Aristotle claims
     3. Politics of Aristotle
     4. Arist. Pol. 1304b
     5. Line 1 of the first book of Homer's Iliad
     6. Hom. Il. I 1
     7. Α 1 (== upper-case Alpha 1, Hellenistic book notation)
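Use cases 5 to 7 above all point at the same line, but they spell the book number differently. A small Python sketch of how the three notations could be normalised to a single "book.line" string follows. It assumes the conventional mapping of the 24 upper-case Greek letters to Iliad books 1 to 24, handles only the Roman digits I, V and X, and uses hypothetical function names; free-text forms such as use cases 1 to 3 and 5 are out of its scope.

```python
# Upper-case Greek letters Alpha..Omega (U+0391..U+03A9); U+03A2 is unassigned.
GREEK_UPPER = "".join(chr(c) for c in range(0x0391, 0x03AA) if c != 0x03A2)

ROMAN_DIGITS = {"I": 1, "V": 5, "X": 10}  # enough for books 1..24


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral built from I, V and X to an integer."""
    total = 0
    previous = 0
    for char in reversed(numeral):
        value = ROMAN_DIGITS[char]
        # A smaller digit placed before a larger one is subtracted (IV, IX).
        total += value if value >= previous else -value
        previous = value
    return total


def book_number(token: str) -> int:
    """Normalise an Arabic, Roman or Greek-letter book designation."""
    if token.isdigit():
        return int(token)
    if len(token) == 1 and token in GREEK_UPPER:
        return GREEK_UPPER.index(token) + 1
    if token and all(char in ROMAN_DIGITS for char in token):
        return roman_to_int(token)
    raise ValueError(f"unrecognised book designation: {token!r}")


def normalize_passage(book_token: str, line_token: str) -> str:
    """Return the passage in the 'book.line' form used in CTS URNs."""
    return f"{book_number(book_token)}.{int(line_token)}"


print(normalize_passage("I", "1"))
print(normalize_passage("\u0391", "1"))  # upper-case Greek Alpha
```

The Greek alphabet is generated from code points rather than typed, so that Latin look-alikes such as "I" and "X" cannot be confused with Greek Iota and Chi.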
  22. Designing a MF for Canonical Texts References (1)
     • Start from a specific problem (principle #1)
       – Problem: link secondary to primary sources on the web
     • Reuse building blocks from widely adopted standards (princ. #4)
       – Canonical texts citation scheme widely used among scholars of Classical literature
       – Canon of Greek Literature provided as an authority list compliant with the Registry Services Protocol
     • “Paving the Cowpaths”
       – Keep references looking the same as they do now
       – In addition, add semantics to references
       – Also allow internal linking systems
  23. Designing a MF for Canonical Texts References (2)
     • Modularity and embeddability (princ. #5)
       1. MF for author
       2. MF for works
       3. MF for text references
  24. Designing a MF for Canonical Texts References (3)
     [Figure with two panels: "Reference appearance" and "Reference underlying microformatted content"]
     urn:cts:greekLit:tlg0012:tlg001:20.131-20.137
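The slide's markup did not survive in this transcript, so the following Python sketch only illustrates the general idea of keeping the visible reference unchanged while carrying a URN underneath. The class name `ctsref` and the use of an `abbr` element with the URN in its `title` attribute are assumptions made for the example, not the microformat actually proposed in the talk.

```python
from html import escape
from html.parser import HTMLParser

REFERENCE_CLASS = "ctsref"  # hypothetical class name, not taken from the slides


def render_reference(display_text: str, urn: str) -> str:
    """Wrap the printed form of a reference in markup carrying its URN."""
    return (
        f'<abbr class="{REFERENCE_CLASS}" title="{escape(urn, quote=True)}">'
        f"{escape(display_text)}</abbr>"
    )


class ReferenceExtractor(HTMLParser):
    """Collect the URNs of all microformatted references in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.urns: list = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if REFERENCE_CLASS in classes and attributes.get("title"):
            self.urns.append(attributes["title"])


def extract_urns(xhtml: str) -> list:
    """Return the URNs found in the given (X)HTML fragment, in document order."""
    parser = ReferenceExtractor()
    parser.feed(xhtml)
    parser.close()
    return parser.urns


snippet = "<p>See " + render_reference(
    "Hom. Il. 20.131-137", "urn:cts:greekLit:tlg0012:tlg001:20.131-20.137"
) + ".</p>"
print(snippet)
print(extract_urns(snippet))
```

A reader sees the familiar abbreviated citation, while a browser extension or harvester can read the URN without having to interpret the abbreviation.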
  25. The Microformat in action
     • Get some valid microformatted references
     • Tag resources from a popular review with URNs instead of simple tags
     • Make the browser aware of microformatted content by adding support for the CTSreference MF to the Operator extension for Firefox
     • Add example actions to perform upon each MF:
       – find pertinent bookmarks on del.icio.us
       – search for pertinent research articles on CiteULike
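Tagging resources with URNs rather than free-form tags makes "pertinent" something a program can decide: a resource tagged with a passage is also pertinent to its work and its author. A minimal Python sketch of that prefix matching follows; the resource titles and the helper names are made up for illustration, and no real bookmarking service is queried.

```python
def is_pertinent(tag_urn: str, query_urn: str) -> bool:
    """True when the tag names the queried entity or something inside it."""
    tag_parts = tag_urn.split(":")
    query_parts = query_urn.split(":")
    # Compare whole components so that "tlg001" never matches "tlg0012".
    return tag_parts[: len(query_parts)] == query_parts


def find_pertinent(resources: dict, query_urn: str) -> list:
    """Return the titles of resources carrying at least one pertinent tag."""
    return [
        title
        for title, tags in resources.items()
        if any(is_pertinent(tag, query_urn) for tag in tags)
    ]


# Made-up resources tagged with URNs instead of simple tags.
resources = {
    "Article A": ["urn:cts:greekLit:tlg0012:tlg001:20.249"],
    "Article B": ["urn:cts:greekLit:tlg0012:tlg002"],
    "Article C": ["urn:cts:greekLit:tlg0086"],
}

print(find_pertinent(resources, "urn:cts:greekLit:tlg0012"))         # author level
print(find_pertinent(resources, "urn:cts:greekLit:tlg0012:tlg001"))  # work level
```

The same query can therefore be widened or narrowed simply by dropping or adding trailing URN components.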
  26. The Microformat in action
     [Screenshot callouts:]
     • A green icon means that Operator is working...
     • Recognized microformats
     • Available actions
     • Some microformatted references
  27. The Microformat in action
  28. The Microformat in action
  29. The Microformat in action
  30. Benefits for scholarship on Ancient Languages
     • Citations encoded with a MF express references in a form that is:
       – Cross-language
       – Fully semantic, interoperable
       – Reusable
     • The reference linking system produced is:
       – Open (client-side based)
       – Independent from specific solutions
     • Microformatted references allow:
       – targeted search -> more precise Information Retrieval tools (Pingerati: microformats search engine provided by developers at Technorati)
  31. TODOs
     • Discussion on Microformats' mailing lists and wiki
     • Advocacy and support by real projects
     • Support of a digital library built upon the CTS protocol
     • URNs as semantic tags and keywords in metadata description
     • Tools for easy authoring
     • Webservices taking advantage of such a MF:
       – An application that manages and exports references with several output formats to desktop applications
       – A harvester of CTS repositories
  32. References
     • John Allsopp, Microformats: Empowering Your Markup for Web 2.0, Berkeley, CA: friends of ED; New York: distributed to the book trade by Springer Verlag, 2007.
     • Neel Smith, “TextServer: Toward a Protocol for Describing Libraries”, Classics@ vol. 2, edition of April 3, 2004.
     • G. Crane et al., “Beyond digital incunabula: Modeling the next generation of digital libraries”, Proceedings of the 10th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2006), vol. 4172.
     • The Canonical Text Services (CTS) Protocol, current version: 1.1 <http://katoptron.holycross.edu/cocoon/diginc/specs/cts>
     • The Registry Services Protocol, current version: 1.0.rc1 <http://katoptron.holycross.edu/cocoon/diginc/specs/registry>