Agnes Thomas, Francesco Mambrini & Matteo Romanello (DAI, Berlin)
'Insights in the World of Thucydides: The Hellespont Project as a research environment for Digital History'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday August 9th.
The Hellespont Project (German Archaeological Institute and Tufts University) aims to integrate two of the largest online collections for the study of Antiquity, the Perseus Digital Library and the Arachne archaeological database, in a dynamic digital research environment. Historians will have access to heterogeneous materials and resources, such as ancient texts, archaeological evidence, historical background, and modern scholarly literature, while the documents related to each historical event taken from the textual evidence will be interconnected through the CIDOC-CRM model.
Hellespont as a case study focuses on a limited historical period, the 50-year period in the history of Athens between the end of the Persian Wars (479 BCE) and the outbreak of the Peloponnesian War (431 BCE). Furthermore, it follows the narration presented by the most important written source, chapters 1.89-118 of the Histories of Thucydides, who was a contemporary of some of the events. One of the points of departure for the project is the annotation of Thucydides' text with multiple layers of linguistic information. Our goal is to create a "digital sourcebook" rich in machine-actionable information, where historians can find references to sources as well as tools that support linguistic analysis of the original texts.
Documents are bridged using the event-based CIDOC-CRM. We are working with two different concepts of events. In the CIDOC ontology, events encompass all changes of state in cultural systems: they are identified by reference to historical scholarship. In Ancient History, where event reconstruction is mostly based on the interpretation of written sources, this definition is insufficient. We are therefore implementing a data-driven approach, based on the semantic/syntactic strategies that express change in the external world through language. We aim to identify such strategies through a fine-grained semantic annotation of the ancient written texts.
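As a rough illustration of what bridging documents through events could look like, the sketch below stores a few subject/predicate/object statements and collects everything linked to one event. It is not the project's actual data model: all identifiers (`event:example-1`, `text:thuc-1.89`, `object:example-arachne-1`, `article:example-secondary`) are invented placeholders, and only the CIDOC-CRM names E5 Event, P70 documents and P12 occurred in the presence of are taken from the standard.

```python
from collections import defaultdict

# (subject, predicate, object) statements. Every identifier is a hypothetical
# placeholder; only the CIDOC-CRM class and property names are from the standard.
TRIPLES = [
    ("event:example-1", "rdf:type", "crm:E5_Event"),
    ("text:thuc-1.89", "crm:P70_documents", "event:example-1"),
    ("article:example-secondary", "crm:P70_documents", "event:example-1"),
    ("event:example-1", "crm:P12_occurred_in_the_presence_of", "object:example-arachne-1"),
]


def resources_for_event(triples, event_id):
    """Group every resource linked to the event by the property that links it."""
    linked = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            continue  # typing statements do not link to other resources
        if obj == event_id:
            linked[predicate].append(subject)
        elif subject == event_id:
            linked[predicate].append(obj)
    return dict(linked)


related = resources_for_event(TRIPLES, "event:example-1")
for prop, resources in related.items():
    print(prop, resources)
```

The point of the design is that a text passage, a piece of modern scholarship and an archaeological object never reference each other directly; each is attached to the event, which acts as the hub.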
We are going to present the digitally analysed text of Thucydides, including different kinds of additional information, in a single Virtual Research Environment (VRE). The interface, which is still being implemented, is based on the same idea as GapVis: a visual interface for reading texts that provides the user with multiple views of the same passage. In the presentation we will show the most important parts of the different views the user will be able to access in the interface.
Digital Classicist London Seminars 2013 - Seminar 7 - Federico Boschetti & Br...
An Integrated System For Generating And Correcting Polytonic Greek OCR
Federico Boschetti (CNR, Pisa) and Bruce Robertson (Mount Allison University, Canada)
Digital Classicist London & Institute of Classical Studies seminar 2013
Friday July 19th at 16:30, in Room S264, Senate House, Malet Street, London WC1E 7HU
In many fields, the digital books revolution provides wide and highly detailed access to pertinent texts; but this revolution has left behind scholars working with ancient Greek. While it is true that Hellenists have had digitized canonical texts for many years, these collections' relatively limited scope and restrictive licenses are increasingly at odds with recent currents in computer-based humanities research: linked data, large-scale text mining, and syntactic treebanking, to name a few. Perhaps the most important impediments to digitizing polytonic Greek have been the lack of high-quality optical character recognition for this script, especially under open-source licenses, and the lack of an assisted editor for polytonic Greek OCR output. In this seminar, we present an integrated system that fills these critical gaps, making it possible for polytonic Greek texts to be digitized en masse.
Rigaudon OCR is a complete suite of scripts, Python code and data required for producing polytonic Greek OCR. It comprises an OCR engine based on Gamera, with many features specific to the recognition of polytonic Greek, and specific classifiers to identify the characters in Teubner, Teubner-sans-serif, OCT/Loeb, and Didot editions. It includes an automatic spellchecker designed to correct Greek OCR errors, and it has a process for combining existing, high-quality Latin-script OCR output with parallel Greek output, as illustrated by this papyrological text. Finally, it coordinates these processes through the Sun Grid Engine scripts required to queue and parallelize them.
Digital Classicist London Seminars 2013 - Seminar 6 (part 2) - Greta Franzini
Greta Franzini (University College London)
'A catalogue of digital editions: Towards a digital edition of Augustine's de Civitate Dei'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday July 12th.
The focus of my doctoral studies at the UCL Centre for Digital Humanities is the creation of a digital edition of the oldest surviving manuscript of St Augustine's De Civitate Dei. The manuscript dates back to the early fifth century, and most of the scarce existing research predates the 1950s. Its much debated provenance and authorship, due to its being contemporary with Augustine himself, are as intriguing as its rare palaeographical features and marginalia. My research seeks, firstly, to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital editions. The catalogue records features, scope, and philological as well as technological aspects of each edition, and aims to become a collaborative scholarly endeavour for the benefit of the Digital Humanities community. Secondly (and consequently), lessons learnt from the catalogue will inform the production of an electronic edition of De Civitate Dei, which will include transcriptions of the text and the scholia, high-definition images, a short critical apparatus, as well as background information and links to relevant resources.
A catalogue of digital editions is greatly beneficial as it provides: an accessible, unique record of which texts have had digital editions created and the historical period they belong to; a data bank of features, tools, licences, funding bodies and locations; an insight into past, present and future projects; the possibility of viewing trends or patterns (e.g. what time periods are most covered or which institutions produce the largest number of digital editions); a platform where collaborators can engage in live discussions and update information as it becomes available; a means of identifying which areas need to be improved.
Interesting facts are already beginning to emerge: several projects, for instance, have not set up analytics as a means of studying usage; projects urging the digital reunification of manuscript fragments are often internally fragmented themselves, having split the project between institutions rather than centralising the material for easy retrieval and management; and TEI guidelines are not as widely adopted in the field of digital editions as we might think.
Digital Classicist London Seminars 2013 - Seminar 6 (part 1) - Eleni Bozia
This document describes a web-based application called the Digital Epigraphic Archive (DEA) that facilitates the preservation, study, and dissemination of ancient inscriptions and archaeological artifacts. The DEA uses a low-cost method to digitize paper squeezes of inscriptions by scanning them twice with different light sources and then uses computer vision techniques to reconstruct the 3D surface and perform automated epigraphic analysis, including letter segmentation, grouping, and clustering. The DEA was tested on fragments from the archaeological site of Epidauros and was able to accurately reconstruct 3D surfaces and analyze letterforms.
Dot Porter (University of Pennsylvania)
'The Medieval Electronic Scholarly Alliance: a federated platform for discovery and research'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday July 5th.
The Medieval Electronic Scholarly Alliance (MESA) is a federated international community of scholars, projects, institutions, and organizations engaged in digital scholarship within the field of medieval studies. Funded by the Andrew W. Mellon Foundation, MESA seeks both to provide a community for those engaged in digital medieval studies and to meet emerging needs of this community, including making recommendations on technological and scholarly standards for electronic scholarship, the aggregation of data, and the ability to discover and repurpose this data.
This presentation will focus on the discovery aspect of MESA, and how it might serve the non-digital medievalist who may nevertheless be interested in finding and using digital resources. Starting with a history of medievalists and their interactions with digital technology as told through three data sets (the International Congress on Medieval Studies (first held in 1962), arts-humanities.net (a digital project database in the UK, sponsored by JISC and the Arts & Humanities Research Council), and two surveys, from 2002 and 2011, that looked specifically at medievalists' use of digital resources), I will draw out some potential issues that this history has for the current developers of digital resources for medievalists, and investigate how MESA might serve to address these issues.
Valeria Vitale (King's College London)
'An Ontology for 3D Visualization in Cultural Heritage'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday June 14th.
To date, 3D computer graphics and modelling techniques have been used in the study of the ancient world mainly as a means to display traditional research. The value of these digital techniques has been often assessed merely on the degree of graphic aesthetic quality.
The pursuit of "photorealism" has proven not only ineffective in engaging the audience but also scientifically misleading, as it suggests that it is possible to reproduce an artefact or scene "exactly as it was" in the past.
Behind every scholarly 3D visualisation is a thorough study of excavation records, iconographic documentation, ancient literary sources, artistic canons and precedents. However, this valuable research (that may lead to new discoveries in the field) is not always detectable in the final visual outcome.
The London Charter for the Computer-based Visualisation of Cultural Heritage made a huge step forward in the regulation of scholarly 3D visualisation, prescribing that researchers' choices and motivation must all be documented. No 3D model could be considered a scholarly resource if its research method was not "transparent".
The London Charter presents methodological guidelines for recording this data, but does not go as far as to offer a formal framework in which to place this information; each modeller is left to simply follow their own style. Moreover, the clients who commissioned the 3D model (such as museums or other cultural institutions) are frequently more interested in the final product than in the rationale, which is often completely overlooked and not circulated (or, in the worst case, dropped from the budget line altogether).
Since there are programming languages that enable 3D environments to interact successfully with HTML, I propose that it would be useful to create one or more ontologies to standardise the verbal component of the documentation, embedding it in the 3D model itself.
Tom Brughmans (Southampton)
'Exploring visibility networks in Iron Age and Roman Southern Spain with Exponential Random Graph Models'
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday June 7th.
Many archaeological applications of formal network techniques consist of an exploration of empirically attested archaeological entities linked by relationships (of whatever nature the researcher considers meaningful). Among the most common issues with these exploratory approaches are how different data types can be used to create networks or validate hypothetical relational processes and how long-term change in connectivity can be explored. Through a case study on urban connectivity in Roman Southern Spain, this paper will discuss how Exponential Random Graph Models (ERGM) can help overcome such issues.
Traditional approaches to the archaeology of Roman Southern Spain have neglected the study of inter-urban connections (Keay 1998). Iron Age (ca. 5th c. BC to 3rd c. BC) and Roman (ca. 3rd c. BC to 5th c. AD) sites as well as different archaeological data types are often studied independently, which is necessary for a critical understanding of these different sources. However, all these sources were also once part of a single long-term cultural process. A multi-scalar exploratory network method is introduced that aims to explore aspects of the changing interactions between 190 sites dated to a range of ten centuries as evidenced through ten archaeological data types. This paper will focus in particular on networks of visibility. In this type of network, a pair of sites is connected when one site can be seen from the other. This exploratory approach is enhanced through the use of ERGM (Robins et al. 2007) for the analysis of subnetworks (particular configurations of connections between small sets of nodes). The assumptions archaeologists formulate about how relationships emerge relative to their position in the network (hypothetical past processes) can be tested using these subnetworks. With these models, the frequency of certain subnetworks in random graphs is compared with their frequency in the empirically attested network, to examine the probability that the subnetworks might have emerged through random processes. In doing this, the border region between exploratory and confirmatory network analysis is explored. This paper will critically evaluate the potential and limitations of such an approach for archaeology.
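The comparison between an observed network and random graphs can be illustrated with a much simpler stand-in for a fitted ERGM. The sketch below counts one subnetwork type (triangles of mutually visible sites) in a toy network and estimates how often uniformly random graphs with the same number of edges contain at least as many. The site labels and edges are invented, visibility is assumed to be symmetric, and the uniform random baseline is an assumption made for illustration; it does not reproduce the models used in the paper.

```python
import random
from itertools import combinations


def count_triangles(nodes, edges):
    """Count sets of three sites in which every pair is connected."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if frozenset((a, b)) in edge_set
        and frozenset((a, c)) in edge_set
        and frozenset((b, c)) in edge_set
    )


def random_graph(nodes, n_edges, rng):
    """Place the same number of edges uniformly at random among all site pairs."""
    return rng.sample(list(combinations(nodes, 2)), n_edges)


def compare_with_random(nodes, edges, n_samples=1000, seed=0):
    """Return the observed triangle count and the share of random graphs
    that contain at least as many triangles."""
    rng = random.Random(seed)
    observed = count_triangles(nodes, edges)
    hits = sum(
        count_triangles(nodes, random_graph(nodes, len(edges), rng)) >= observed
        for _ in range(n_samples)
    )
    return observed, hits / n_samples


# Toy data: six hypothetical sites and seven symmetric visibility links.
sites = ["A", "B", "C", "D", "E", "F"]
visible = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
           ("D", "E"), ("D", "F"), ("E", "F")]

observed, share = compare_with_random(sites, visible)
print(f"observed triangles: {observed}, share of random graphs reaching that: {share:.3f}")
```

A small share suggests that the configuration is unlikely to have arisen from random link placement alone; a full ERGM additionally estimates parameters for several configurations at once rather than testing one count against a fixed baseline.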
Similar to Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al. (10)
Digital Classicist London Seminars 2013 - Seminar 7 - Federico Boschetti & Br...DigitalClassicistLondon
An Integrated System For Generating And Correcting Polytonic Greek OCR
Federico Boschetti (CNR, Pisa) and Bruce Robertson (Mount Allison University, Canada)
Digital Classicist London & Institute of Classical Studies seminar 2013
Friday July 19th at 16:30, in Room S264, Senate House, Malet Street, London WC1E 7HU
In many fields, the digital books revolution provides wide and highly detailed access to pertinent texts; but this revolution has left behind scholars working with ancient Greek. While it is true that Hellenists have had digitized canonical texts for many years, these collections' relatively limited scope and restrictive licenses are increasingly at odds with recent currents in computer-based humanities research: linked data, large-scale text mining, and syntactic treebanking, to name a few. Perhaps the most important impediments to digitizing polytonic Greek have been the lack of high-quality optical character recognition for this script, especially under open-source licenses, and the lack of an assisted editor for polytonic Greek OCR output. In this seminar, we present an integrated system that fills these critical gaps, making it possible for polytonic Greek texts to be digitized en masse.
Rigaudon OCR is a complete suite of scripts, Python code and data required for producing polytonic Greek OCR. It comprises an OCR engine based on Gamera with many features specific to the recognition of polytonic Greek and specific classifiers to identify the characters in Teubner, Teubner-sans-serif, OCT/Loeb, and Didot editions. It includes an automatic spellchecker designed to correct Greek OCR errors, and it has a process for combining existing, high-quality Latin-script OCR output with parallel Greek output, as illustrated by this papyrological text. Finally, it coordinates these processes through the Sun Grid Engine scripts required to queue and parallelize them.
Digital Classicist London Seminars 2013 - Seminar 6 (part 2) - Greta Franzini DigitalClassicistLondon
Greta Franzini (University College London)
'A catalogue of digital editions: Towards a digital edition of Augustine's de Civitate Dei'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday July 12th.
The focus of my doctoral studies at the UCL Centre for Digital Humanities is the creation of a digital edition of the oldest surviving manuscript of S. Augustine's De Civitate Dei. The manuscript dates back to the early fifth century and most of the existing, scarce research we have predates the 1950s. Its much debated provenance and authorship, due to it being contemporary to Augustine himself, are as intriguing as its rare palaeographical features and marginalia. My research seeks to, firstly, examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital editions. The catalogue records features, scope, philological as well as technological aspects of each edition and aims at becoming a collaborative scholarly endeavour for the benefit of the Digital Humanities community. Secondly (and consequently), lessons learnt from the catalogue will inform the production of an electronic edition of De Civitate Dei, which will include transcriptions of the text and the scholia, high-definition images, a short critical apparatus, as well as background information and links to relevant resources.
A catalogue of digital editions is greatly beneficial as it provides: an accessible, unique record of which texts have had digital editions created and the historical period they belong to; a data bank of features, tools, licences, funding bodies and locations; an insight into past, present and future projects; the possibility of viewing trends or patterns (e.g. what time periods are most covered or which institutions produce the largest number of digital editions); a platform where collaborators can engage in live discussions and update information as it becomes available; a means of identifying which areas need to be improved.
Interesting facts are already beginning to emerge: several projects, for instance, have not set up analytics as a means of studying usage; projects urging the digital reunification of manuscript fragments are often internally fragmented themselves, having split the project between institutions rather than centralising the material for easy retrieval and management; and TEI guidelines are not as widely adopted in the field of digital editions as we might think.
Digital Classicist London Seminars 2013 - Seminar 6 (part 1) - Eleni Bozia DigitalClassicistLondon
This document describes a web-based application called the Digital Epigraphic Archive (DEA) that facilitates the preservation, study, and dissemination of ancient inscriptions and archaeological artifacts. The DEA uses a low-cost method to digitize paper squeezes of inscriptions by scanning them twice with different light sources and then uses computer vision techniques to reconstruct the 3D surface and perform automated epigraphic analysis, including letter segmentation, grouping, and clustering. The DEA was tested on fragments from the archaeological site of Epidauros and was able to accurately reconstruct 3D surfaces and analyze letterforms.
Dot Porter (University of Pennsylvania)
'The Medieval Electronic Scholarly Alliance: a federated platform for discovery and research'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday July 5th.
The Medieval Electronic Scholarly Alliance (MESA) is a federated international community of scholars, projects, institutions, and organizations engaged in digital scholarship within the field of medieval studies. Funded by the Andrew W. Mellon Foundation, MESA seeks both to provide a community for those engaged in digital medieval studies and to meet emerging needs of this community, including making recommendations on technological and scholarly standards for electronic scholarship, the aggregation of data, and the ability to discover and repurpose this data.
This presentation will focus on the discovery aspect of MESA, and how it might serve the non-digital medievalist who may nevertheless be interested in finding and using digital resources. Starting with a history of medievalists and their interactions with digital technology as told through three data sets (the International Congress on Medieval Studies (first held in 1962), arts-humanities.net (a digital project database in the UK, sponsored by JISC and the Arts & Humanities Research Council), and two surveys, from 2002 and 2011, that looked specifically at medievalists' use of digital resources), I will draw out some potential issues that this history has for the current developers of digital resources for medievalists, and investigate how MESA might serve to address these issues.
Valeria Vitale (King's College London)
'An Ontology for 3D Visualization in Cultural Heritage'.
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday June 14th.
To date, 3D computer graphics and modelling techniques have been used in the study of the ancient world mainly as a means to display traditional research. The value of these digital techniques has been often assessed merely on the degree of graphic aesthetic quality.
The pursuit of "photorealism" has proven not only ineffective in engaging the audience but also scientifically misleading, as it suggests that it is possible to reproduce an artefact or scene "exactly as it was" in the past.
Behind every scholarly 3D visualisation is a thorough study of excavation records, iconographic documentation, ancient literary sources, artistic canons and precedents. However, this valuable research (that may lead to new discoveries in the field) is not always detectable in the final visual outcome.
The London Charter for the Computer-based Visualisation of Cultural Heritage made a huge step forward in the regulation of scholarly 3D visualisation—prescribing that researchers' choices and motivation must all be documented. No 3D model could be considered a scholarly resource if its research method was not "transparent".
The London Charter presents methodological guidelines for recording this data, but does not go as far as to offer a formal framework in which to place this information; each modeller is left to simply follow their own style. Moreover, the clients who commissioned the 3D model (such as museums or other cultural institutions) are frequently more interested in the final product than in the rationale which is often completely overlooked and not circulated (or, in the worst case, dropped from the budget line altogether).
Since there are programming languages that enable 3D environments to successfully interact with html, I propose that it would be useful to create one or more ontologies to standardise the verbal component of the documentation, embedding it in the 3D model itself.
Tom Brughmans (Southampton)
'Exploring visibility networks in Iron Age and Roman Southern Spain with Exponential Random Graph Models'
Digital Classicist London & Institute of Classical Studies seminar 2013, Friday June 7th.
Many archaeological applications of formal network techniques consist of an exploration of empirically attested archaeological entities linked by relationships (of whatever nature the researcher considers meaningful). Among the most common issues with these exploratory approaches are how different data types can be used to create networks or validate hypothetical relational processes and how long-term change in connectivity can be explored. Through a case study on urban connectivity in Roman Southern Spain, this paper will discuss how Exponential Random Graph Models (ERGM) can help overcome such issues.
Traditional approaches to the archaeology of Roman Southern Spain have neglected the study of inter-urban connections (Keay 1998). Iron Age (ca. 5th c. BC to 3rd c. BC) and Roman (ca. 3rd c. BC to 5th c. AD) sites as well as different archaeological data types are often studied independently, which is necessary for a critical understanding of these different sources. However, all these sources were also once part of a single long-term cultural process. A multi-scalar exploratory network method is introduced that aims to explore aspects of the changing interactions between 190 sites dated to a range of ten centuries as evidenced through ten archaeological data types. This paper will focus in particular on networks of visibility. In this type of network, a pair of sites is connected when one site can be seen from the other. This exploratory approach is enhanced through the use of ERGM (Robins et al. 2007) for the analysis of subnetworks (particular configurations of connections between small sets of nodes). The assumptions archaeologists formulate about how relationships emerge relative to their position in the network (hypothetical past processes) can be tested using these subnetworks. With these models, the frequency of certain subnetworks in random graphs is compared with their frequency in the empirically attested network, to examine the probability that the subnetworks might have emerged through random processes. In doing this, the border region between exploratory and confirmatory network analysis is explored. This paper will critically evaluate the potential and limitations of such an approach for archaeology.
Digital Classicist London Seminars 2013 - Seminar 10 - Agnes Thomas et al.
1. Introduction
The GapVis Interface
Event annotation
Secondary Literature
Insights in the World of Thucydides
The Hellespont Project as a research environment for Digital History
A. Thomas (a, b), F. Mambrini (b), M. Romanello (b, c)
(a) Universität zu Köln
(b) Deutsches Archäologisches Institut, Berlin
(c) King’s College, London
August 9, 2013
Thomas, Mambrini, Romanello The Hellespont Project
2. Outline
1 Introduction
2 The GapVis Interface
3 Event annotation
Manual event annotation
Linguistic annotation
4 Secondary Literature
3. The Hellespont Project
Integrating Arachne and Perseus
October 2010 - September 2013
http://arachne.uni-koeln.de/drupal/?q=de/node/231
4. Cooperating Institutions and Persons
German Archaeological Institute Berlin:
Ortwin Dally
Reinhard Förtsch
Francesco Mambrini
Matteo Romanello
Wolfgang Schmidle
The Perseus Project:
Bridget Almas
Alison Babeu
Lisa Cerrato
Gregory Crane
Cologne Digital Archaeology Laboratory:
Carina Berning
Robert Kummer
Alexander Recht
Marcel Riedel
Karen Schwane
Agnes Thomas
5. GapVis for Hellespont
Named entities, linguistic information, event annotation, and bibliography connected in one interface:
A case study on Thuc. 1.89-118
Different formats (TEI, CIDOC-CRM, AGDT, PML, ...)
User interface based on GapVis:
http://nrabinowitz.github.io/gapvis
12. Going through secondary literature
13. Event List
14. Oinophyta Event
15. Myronides as a general
18. Natural language and events
Thuc. 1.102.2
μάλιστα δ᾿ αὐτοὺς ἐπεκαλέσαντο ὅτι τειχομαχεῖν ἐδόκουν δυνατοὶ εἶναι, τοῖς δὲ πολιορκίας μακρᾶς καθεστηκυίας τούτου ἐνδεᾶ ἐφαίνετο: βίᾳ γὰρ ἂν εἷλον τὸ χωρίον.
Translation
[The siege of Ithome proved tedious, and the Lacedaemonians called in, among other allies, the Athenians ...]
[They] invited them especially because [they] considered [them] particularly skilled in siege operations, while, since the siege for them was dragging on, [their] own deficiency in that sort of warfare was clear: for otherwise [they] would have taken the place by force.
21. NLP Pipeline
Tokenization
POS-Tagging
Syntactic Parsing
Thematic Roles
Information Structure
Coreference Resolution
22. NLP Pipeline
NLP Process Ancient Greek?
Chunking
Lemmatization
POS-tagging
Syntactic parsing
Word-sense disambiguation
Co-reference resolution
Semantic role annotation
23. Using and Enhancing the available resources
The Ancient Greek Dependency Treebank
AGDT: treebank with word-by-word morphological and dependency-based syntactical description
a step forward: semantic information
24. Analytical Level
“Surface” syntax
Tree a-a-1999.01.0199_book1-chapter89_3 (root AuxS), one token per line with its analytical function:
οἱ Atr
γὰρ AuxY
Ἀθηναῖοι Sb
τρόπῳ Adv
τοιῷδε Atr
ἦλθον Pred
ἐπὶ AuxP
τὰ Atr
πράγματα Obj
ἐν
ηὐξήθησαν Atr
. AuxK
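To make the analytical layer concrete, the token and label pairs from this slide can be held in a small data structure and queried by function. This is only a minimal sketch: the head links of the dependency tree are not preserved in the transcript, so it works on the flat labels alone, and the helper names (`forms_with_label`, `content_tokens`) are hypothetical rather than part of any AGDT tooling. The token ἐν is left out because its label did not survive.

```python
# Token/label pairs as they survive in the slide transcript.
# Assumption: head (parent) links are not available, so this is a flat list.
tokens = [
    ("οἱ", "Atr"), ("γὰρ", "AuxY"), ("Ἀθηναῖοι", "Sb"), ("τρόπῳ", "Adv"),
    ("τοιῷδε", "Atr"), ("ἦλθον", "Pred"), ("ἐπὶ", "AuxP"), ("τὰ", "Atr"),
    ("πράγματα", "Obj"), ("ηὐξήθησαν", "Atr"), (".", "AuxK"),
]


def forms_with_label(tokens, label):
    """Return the word forms that carry a given analytical function."""
    return [form for form, rel in tokens if rel == label]


def content_tokens(tokens):
    """Drop auxiliary nodes (labels starting with 'Aux')."""
    return [(form, rel) for form, rel in tokens if not rel.startswith("Aux")]


subject = forms_with_label(tokens, "Sb")
predicate = forms_with_label(tokens, "Pred")
```

A query for `Sb` and `Pred` already gives the skeleton "the Athenians came"; the richer roles need the tectogrammatical layer.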
25. Valency
The verbal node expresses a little drama. As a drama, it implies a process and, most of the times, actors and circumstances
L. Tesnière
26. Tectogrammatical annotation
Tree t-t_tree-grc-s1-root, one node per line with the labels shown on the slide:
root
γάρ1 PREC atom
Ἀθηναῖος1 ACT n.denot
ἔρχομαι1 enunc PRED v
πρᾶγμα1 DIR3 state n.denot
ὅς1 ACMP circ n.denot
#PersPron ACT n.denot
αὐξάνω1 RSTR v
τρόπος1 MANN n.denot
τοιόσδε1 RSTR adj.pron.def.demon
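The step from a tectogrammatical tree to an event record can be sketched as follows. This is a hedged illustration, not the project's implementation: the class `TNode` and the function `event_frame` are hypothetical names, the numeric suffixes on the lemmas are dropped, and the parent-child links are an assumption reconstructed from the sentence, since the transcript preserves only the node labels.

```python
from dataclasses import dataclass, field


@dataclass
class TNode:
    """A tectogrammatical node: lemma, functor, and dependent nodes."""
    lemma: str
    functor: str
    children: list = field(default_factory=list)


def event_frame(pred):
    """Collect the direct dependents of a predicate, grouped by functor."""
    frame = {"predicate": pred.lemma}
    for child in pred.children:
        frame.setdefault(child.functor, []).append(child.lemma)
    return frame


# Assumed tree shape (the attachment of each node is a reconstruction).
tree = TNode("ἔρχομαι", "PRED", [
    TNode("γάρ", "PREC"),
    TNode("Ἀθηναῖος", "ACT"),
    TNode("τρόπος", "MANN", [TNode("τοιόσδε", "RSTR")]),
    TNode("πρᾶγμα", "DIR3", [
        TNode("αὐξάνω", "RSTR", [TNode("ὅς", "ACMP"), TNode("#PersPron", "ACT")]),
    ]),
])

frame = event_frame(tree)
```

The resulting dictionary reads like a row in an event table: a predicate (come), an actor (the Athenians), a direction (the circumstances) and a manner.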
27. From treebanks to event data-bases
28. What can you do with multi-layer trees?
“Meaningful” relations between NEs
[The Athenians] ... brought the territories of Boeotia and Phocis under their obedience, and withal razed the walls of Tanagra and took of the wealthiest of the Locrians of Opus a hundred hostages, and finished also at the same time their long walls at home (1.108.3)
29. Maps with semantically relevant relations
E.g. travels by sea
πλέω (sail)
Actor: The Athenians
DIR3 (to), DIR1 (from): other NEs
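The sea-travel example can be sketched as a filter over event frames that turns each πλέω event into an (actor, origin, destination) edge ready for plotting on a map. This is a minimal sketch under stated assumptions: the frame layout (a dictionary keyed by functor) and the function name `travel_edges` are hypothetical, and the place names are placeholders, not data from the annotated text.

```python
from itertools import product


def travel_edges(frames, verb="πλέω"):
    """Return (actor, origin, destination) triples for frames of one verb.

    DIR1 is read as the origin and DIR3 as the destination; a missing
    role is represented by None.
    """
    edges = []
    for frame in frames:
        if frame.get("predicate") != verb:
            continue
        actors = frame.get("ACT") or [None]
        origins = frame.get("DIR1") or [None]
        targets = frame.get("DIR3") or [None]
        edges.extend(product(actors, origins, targets))
    return edges


# Placeholder frames for illustration only.
frames = [
    {"predicate": "πλέω", "ACT": ["The Athenians"], "DIR1": ["Place A"], "DIR3": ["Place B"]},
    {"predicate": "ἔρχομαι", "ACT": ["The Athenians"], "DIR3": ["Place C"]},
    {"predicate": "πλέω", "ACT": ["The Athenians"], "DIR3": ["Place C"]},
]

edges = travel_edges(frames)
```

Keeping the missing origin as `None` rather than dropping the event leaves it to the map layer to decide how to draw a journey with only one known endpoint.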
30. What can you do with multi-layer trees?
Extraction and analysis of events
What actions do the Athenians perform?
31. What can you do with multi-layer trees?
Extraction and analysis of events
What actions do the Spartans perform?
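Questions such as "what actions do the Athenians perform?" amount to grouping event frames by actor and counting predicates. The sketch below assumes the same hypothetical frame layout (a dictionary with a `predicate` and an `ACT` list); the sample frames are hand-made English glosses loosely based on the passages quoted in these slides (1.108.3 and 1.102.2), not output of the project's annotation.

```python
from collections import Counter, defaultdict


def actions_by_actor(frames):
    """Count, for every actor, how often each predicate occurs."""
    counts = defaultdict(Counter)
    for frame in frames:
        for actor in frame.get("ACT", []):
            counts[actor][frame["predicate"]] += 1
    return counts


# Hand-made illustration data (glosses, not annotated lemmas).
frames = [
    {"predicate": "raze", "ACT": ["Athenians"]},
    {"predicate": "take", "ACT": ["Athenians"]},
    {"predicate": "finish", "ACT": ["Athenians"]},
    {"predicate": "invite", "ACT": ["Lacedaemonians"]},
]

counts = actions_by_actor(frames)
athenian_actions = sorted(counts["Athenians"])
```

Comparing the two resulting profiles side by side is one way to contrast how the narrative presents each party.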
32. Related Secondary Literature (from JSTOR)
Figure: http://tiny.cc/GapVis-SecLit
33. Mining JSTOR: Where is Thuc. “hiding”?
A meaningful subsample
mining citations from all ~171k journal articles, not the best approach
curated bibliography (2009) before project started (CiteULike)
articles in JSTOR related to Thuc 1.89-118
343 articles, 62 journals
journals from bibliography as “seeds”
samples ~73k articles (out of ~171k)
top-down vs bottom-up bibliographic approach
Pros and Cons
comprehensive coverage; > 2 centuries; multilingual
data not openly licensed
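The seed-based sampling described above (journals from the curated bibliography select the subsample to mine) can be sketched in a few lines. This is an assumption-laden illustration: the record fields (`journal`, `id`) and function names are hypothetical and do not reflect the actual JSTOR data format, and the journal names are placeholders.

```python
def seed_journals(bibliography):
    """Collect the set of journals that appear in the curated bibliography."""
    return {entry["journal"] for entry in bibliography}


def sample_by_seeds(articles, seeds):
    """Keep only the articles published in one of the seed journals."""
    return [article for article in articles if article["journal"] in seeds]


# Placeholder records for illustration only.
bibliography = [
    {"id": "b1", "journal": "Journal A"},
    {"id": "b2", "journal": "Journal B"},
    {"id": "b3", "journal": "Journal A"},
]
articles = [
    {"id": "a1", "journal": "Journal A"},
    {"id": "a2", "journal": "Journal C"},
    {"id": "a3", "journal": "Journal B"},
]

seeds = seed_journals(bibliography)
sample = sample_by_seeds(articles, seeds)
```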
36. Extracting Citations: Challenges
sentence segmentation
sentence = sensible unit of context
both for extraction and data analysis (co-citation)
dirty OCR
invalid character sequences (e.g. n)
“inconsistent” use of punctuation
1, 110-15 ; 1.89.1, 1.90 ; I 1, 102, 1
solution: reason based on domain knowledge
similar references, surface similarity
fragments, papyri, inscriptions
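One way to cope with the inconsistent punctuation listed above is to normalise every reference to a dotted form before counting co-citations per sentence. The sketch below is illustrative only and is not the project's extractor: `normalize_ref` and `cocitations` are hypothetical names, separators (spaces, commas, periods) are simply treated as equivalent, Roman book numbers are mapped to Arabic ones, and harder cases such as "I 1, 102, 1" would still need the domain reasoning mentioned on the slide.

```python
import re
from collections import Counter
from itertools import combinations

# Thucydides has eight books, so only I-VIII are needed.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6, "VII": 7, "VIII": 8}


def normalize_ref(raw):
    """Turn '1, 110-15' or 'I 102, 1' into a dotted form such as '1.110-15'."""
    parts = [p for p in re.split(r"[\s.,]+", raw.strip()) if p]
    return ".".join(str(ROMAN.get(p, p)) for p in parts)


def cocitations(sentences):
    """Count pairs of distinct references cited in the same sentence.

    `sentences` is a list of lists of raw reference strings.
    """
    pairs = Counter()
    for refs in sentences:
        normalized = sorted({normalize_ref(r) for r in refs})
        pairs.update(combinations(normalized, 2))
    return pairs


pairs = cocitations([["1.89.1", "1, 90"], ["1.90", "I 89, 1"]])
```

Because both sentences normalise to the same two references, the pair is counted twice even though the surface forms differ.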
37. Thank you!
Our contacts and temporary development server
agnes.thomas@uni-koeln.de
francesco.mambrini@dainst.de
mattero.romanello@dainst.de
http://www.tiny.cc/GapVis-Hellespont