This document discusses developing an Ontology for Historical Persons (OHP) to better structure prosopographical data on the semantic web. It provides examples of existing models like FOAF, TEI and DDH's factoid model. Developing a standardized OHP could help connect separate prosopography projects and move from closed to open collaboration. The OHP would define entities like persons, assertions, roles, events and relationships to provide a framework for consistently representing prosopographical data in a linked open manner. The document proposes an initial workshop to further explore and develop ideas for the OHP.
Exposing Humanities Data for Reuse and Linking - RED, linked data and the sem... - Mathieu d'Aquin
Presented at the workshop of the "Reading Experience Database" (RED) project - London - 25/02/2011.
Discussion on how linked data can benefit research in humanities, using RED and data.open.ac.uk as early examples.
This document summarizes the state of open research data by outlining its evolution over time. It begins with centralized data centers in the 1960s and progresses to more collaborative models of data sharing through community agreements and online supplementary materials. The benefits of open data are discussed, including increased reproducibility and citation advantages for authors who share. While fully open data is the ideal, achieving 3 stars on the 5-star open data scheme is a realistic goal at present. The future may bring stricter funding and publishing requirements to encourage more widespread data sharing.
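The 5-star scheme mentioned above can be sketched as a simple cumulative ladder. This is a rough illustration of Tim Berners-Lee's 5-star open data scheme; the helper function and its name are our own, not from the talk.

```python
# Tim Berners-Lee's 5-star open data scheme as a lookup table, with a
# helper that reports the cumulative criteria met at a given level.
# The describe_level helper is illustrative, not from the summarized talk.

FIVE_STAR_SCHEME = {
    1: "published on the web under an open licence (any format, even PDF)",
    2: "available as machine-readable structured data (e.g. Excel, not a scan)",
    3: "available in a non-proprietary open format (e.g. CSV instead of Excel)",
    4: "uses URIs to identify things, so individual items can be linked to",
    5: "links out to other data sources to provide context",
}

def describe_level(stars: int) -> str:
    """Return the cumulative criteria met at a given star level."""
    if stars not in FIVE_STAR_SCHEME:
        raise ValueError("star level must be between 1 and 5")
    return "; ".join(FIVE_STAR_SCHEME[s] for s in range(1, stars + 1))

# The "currently realistic" target from the summary above:
print(describe_level(3))
```

The ladder makes the summary's point concrete: stars 1-3 concern licensing and format and need no semantic web technology at all, while stars 4-5 are where linked data proper begins.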
morning session talk at the second Keystone Training School "Keyword search in Big Linked Data" held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
Open Research Data: Licensing | Standards | Future - Ross Mounce
This document provides an overview of open research data, including definitions, licensing, standards, and history. It defines open data as data that anyone can freely access, use, modify, and share with few restrictions. For data to be truly open, it recommends using a CC0 public domain waiver or an attribution-only license. It discusses issues with non-commercial and no-derivatives restrictions. The document also provides guidance on technical aspects like recommended file formats and standards. It briefly summarizes the history of data sharing, from centralized data centers to online supplementary data to emerging data paper journals. The key messages are that data should be FAIR (Findable, Accessible, Interoperable, Reusable) and that open data benefits both the researchers who share it and those who reuse it.
What Are Links in Linked Open Data? A Characterization and Evaluation of Link... - Armin Haller
Linked Open Data promises guiding principles for publishing interlinked knowledge graphs on the Web as findable, accessible, interoperable, and reusable datasets. In this talk I argue that while Linked Data may be viewed as a basis for instantiating the FAIR principles, a number of open issues still cause significant data quality problems even when knowledge graphs are published as Linked Data. I will first define the boundaries of what constitutes a single coherent knowledge graph within Linked Data, i.e., present a principled notion of what a dataset is and what links within and between datasets are. I will then define different link types for data in Linked datasets and present the results of our empirical analysis of linkage among the datasets of the Linked Open Data cloud. Recent results from our analysis of Wikidata, which has not been part of the Linked Open Data cloud, will also be presented.
Exploration, visualization and querying of linked open data sources - Laura Po
afternoon hands-on session talk at the second Keystone Training School "Keyword search in Big Linked Data" held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
Specimen-level mining: bringing knowledge back 'home' to the Natural History ... - Ross Mounce
A talk given at the Geological Society of London, UK on 2016/03/09 as part of the Lyell meeting on Palaeoinformatics. http://www.geolsoc.org.uk/lyell16 #lyell16
Social media provides a rich source of contextual information for improving information retrieval, including user profiles, connections between users, comments, and shared content. However, using social media for research poses challenges due to limitations of APIs, non-representative user samples, and data that is constantly changing. Despite these issues, social media still offers opportunities to expand content representation, reduce vocabulary gaps, and surface viral content that improves search capabilities.
This document provides an overview of linked data and the semantic web. It discusses moving from a web of documents to a web of data by making data on the web more structured and interconnected. The key aspects covered include using URIs to identify things, providing structured data about those things via standards like RDF, and including links to other related data to improve discovery. The document also explains some of the core technologies involved like RDF, RDF syntaxes, vocabularies for describing data, and publishing and accessing linked data on the web.
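The two core ideas in that overview, URIs as identifiers and RDF triples as statements, can be shown in a few lines of plain Python. This is a minimal sketch: the FOAF and OWL namespaces are real vocabularies, but the example person URI is invented, and a real system would use an RDF library rather than hand-rolled serialization.

```python
# Minimal illustration of the linked data model: things are identified by
# URIs, statements about them are (subject, predicate, object) triples,
# and object URIs link out to other datasets. The person URI is invented;
# the FOAF and OWL property URIs are real vocabulary terms.

FOAF = "http://xmlns.com/foaf/0.1/"
OWL = "http://www.w3.org/2002/07/owl#"

triples = [
    ("http://example.org/person/ada", FOAF + "name", '"Ada Lovelace"'),
    # an outgoing link connecting this dataset to DBpedia for discovery
    ("http://example.org/person/ada", OWL + "sameAs",
     "http://dbpedia.org/resource/Ada_Lovelace"),
]

def to_ntriples(triples):
    """Serialize triples in N-Triples syntax: URIs in <...>, literals quoted."""
    lines = []
    for s, p, o in triples:
        o_str = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {o_str} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

The `owl:sameAs` triple is the "link to other related data" the summary describes: a consumer that dereferences the DBpedia URI can discover further facts about the same person.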
This document discusses using linked data for digital humanities projects. It describes how linked data allows for the flexible integration of heterogeneous data, metadata, and background knowledge from various sources. By reusing web resources, vocabularies, and ontologies through web standards like URIs, RDF, and SPARQL, linked data enables efficient investigation of integrated research questions across collections, institutions, and domains. It also explains how data provenance is important for digital humanities and fits well with linked data through standards like PROV-O. Examples are provided of digital history projects and a linked data project on Dutch ships and sailors that demonstrates these concepts.
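Provenance of the kind the summary mentions can ride along in the same graph as ordinary data. The sketch below uses the real PROV-O property URIs, but the dataset and organization URIs are invented, and the tiny pattern matcher merely stands in for what a SPARQL engine would do in a real deployment.

```python
# Toy illustration of provenance as linked data: statements use the real
# PROV-O vocabulary, and a small triple-pattern matcher plays the role a
# SPARQL engine would. The example dataset and organization URIs are invented.

PROV = "http://www.w3.org/ns/prov#"

graph = {
    ("http://example.org/dataset/ships", PROV + "wasDerivedFrom",
     "http://example.org/archive/scans"),
    ("http://example.org/dataset/ships", PROV + "wasAttributedTo",
     "http://example.org/org/maritime-museum"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts like a SPARQL variable."""
    return [
        (gs, gp, go) for gs, gp, go in graph
        if (s is None or gs == s)
        and (p is None or gp == p)
        and (o is None or go == o)
    ]

# "Where did the ships dataset come from?"
for _, _, source in match(graph, p=PROV + "wasDerivedFrom"):
    print(source)
```

Because provenance uses the same triple model as the data itself, a researcher can query "what is this dataset derived from?" with the same machinery used for any other integrated research question.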
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval - Mauro Dragoni
The presentation provides an overview of what an ontology is and how it can be used to represent information and retrieve data, with a particular focus on the linguistic resources available to support this kind of task. It surveys semantic-based retrieval approaches, highlighting the pros and cons of semantic approaches compared with classic ones. Use cases are presented and discussed.
Open scholarship [a FOSTER open science talk] - Ross Mounce
A talk by Dr Ross Mounce, given at the FOSTER Open Science event 4th September, King's College London http://www.fosteropenscience.eu/event/foster-discovering-open-practices-pgr-and-early-career-researchers-0
The internet has fast become the first port of call for all searches. The increasing array of chemistry-related resources now available provides chemists a direct path to the discovery of information, one previously accessed via library services and limited to commercial and costly resources. The diversity of information available online is expanding at a dramatic rate and a shift to publicly available resources offers significant opportunities in terms of the benefit to science and society. While the data available online do not generally meet the quality standards available from manually curated sources there are efforts afoot to gather scientists and “crowd source” an improvement in the quality of available data. This article will discuss the types of public compound databases available online, provide a series of example databases and focus on the benefits and disruptions associated with the increased availability of such data and integrating technologies to data-mine the available information.
Web Archives and the dream of the Personal Search Engine - Arjen de Vries
Keynote at the 4th Alexandria Workshop organised by Avishek Anand and Wolfgang Nejdl, L3S, Hannover (Germany). I argue that Web Archives should act as a pivot while revisiting the idea of decentralised search.
See also http://alexandria-project.eu/events/4th-int-alexandria-workshop-19-20-october-2017/
This document provides an introduction to the semantic web and library linked data. It discusses how library data is currently siloed but moving towards being published as linked open data using semantic web standards. Key points covered include the principles of linked data using URIs and RDF triples, examples of library linked data projects, and how RDA is being developed to support linked data. The goal is to make library data more accessible and useful by integrating it into the larger web of data.
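The move from siloed records to linked data that the summary describes amounts to turning flat catalogue fields into triples that reuse shared vocabularies. A minimal sketch, assuming an invented book URI and record, while the Dublin Core property URIs are real:

```python
# Sketch of the "siloed record to linked data" transition: a flat catalogue
# record becomes RDF triples reusing the real Dublin Core vocabulary, so any
# other dataset can point at the same book URI. The record contents and the
# book URI are invented for the example.

DC = "http://purl.org/dc/terms/"

record = {"title": "On the Origin of Species", "creator": "Charles Darwin"}
book_uri = "http://example.org/book/origin-of-species"

# Each field name becomes a predicate URI; each value becomes an object.
triples = [(book_uri, DC + field, value) for field, value in record.items()]

for t in triples:
    print(t)
```

Once every record is addressable by URI and described with shared predicates, library data stops being an isolated database and becomes part of the larger web of data.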
Linked Open Data in Libraries Archives & Museums - Jon Voss
The document discusses the growing Linked Open Data (LOD) movement in libraries, archives, and museums (LODLAM). It notes that LODLAM allows these institutions to explore data interoperability both within the cultural sector and more broadly on the web. The document outlines several outcomes of a LODLAM summit, including outreach, education, developing use cases, and examining issues around copyright and licensing of open data. Examples are provided of institutions that have published bibliographic and other cultural data using open licenses.
This document summarizes a presentation given by three librarians on the role of librarians in the intelligence process. It discusses the competencies and skills that librarians possess, such as open source intelligence collection, data and metadata management, knowledge management, understanding human information behavior, and instructional design. It argues that these enable librarians to take on new roles in intelligence work, including advising, analyzing, and teaching analysts. The document also outlines principles that allow librarians to collaborate effectively, such as assessing information quality, adding value through analysis, focusing communication, and having a mission focus of improving effectiveness through knowledge creation and application.
This presentation was provided by Scott Ziegler of Louisiana State University during the NISO Virtual Conference, Open Data Projects, held on Wednesday, June 13, 2018.
Open Access for Early Career Researchers - Ross Mounce
My talk for the University of Bath Open Access Week session; 23rd October 2013.
http://www.bath.ac.uk/learningandteaching/rdu/courses/pgskills/modules/RP00335.htm
The document discusses open opportunities related to open access, open data, open source software and repositories. It provides an overview of key concepts like open access to scholarly research, open government and open data policies. Examples are given of open data sources from the US government and other organizations. Free and open source software tools for data analysis and visualization are also described. The document closes by discussing open data repositories and ensuring data is openly accessible and citable.
Data Communities - reusable data in and outside your organization - Paul Groth
Data is critical both to the functioning of an organization and as a product in its own right. How can you make that data more usable for both internal and external stakeholders? There is a myriad of recommendations, advice, and strictures about what data providers should do to facilitate data (re)use, and it can be overwhelming. Based on recent empirical work (analyzing data reuse proxies at scale, understanding data sensemaking, and looking at how researchers search for data), I talk about which practices are a good place to start for helping others reuse your data. I put this in the context of the notion of data communities, which organizations can use to foster the use of data both internally and externally.
1. The document summarizes a pilot project between Cambridge University Library and the University of Glasgow aimed at improving research data management through better advice, training, and support for researchers.
2. Interviews with researchers found issues with file organization, storage, preservation, data sharing, and a lack of useful guidance and training.
3. The project recommends producing simple data management guidance, practical training resources with discipline-specific examples, connecting researchers to support staff, and working towards a data management infrastructure.
Metadata and the Amount of Information - Ken Fujiuchi
The document discusses the vast amounts of data and metadata that exist both digitally and physically. It notes that in 2003, statistics showed over 1,600 terabytes of paper-based information existed across libraries and journals, equivalent to millions of books and newspapers. Digitally, over 1.986 exabytes of data existed on hard disks in 1999, and the total information flowing over the internet that year was 532.897 exabytes, over 100 times more than all words ever spoken by humans.
Linked Data and Semantic Web - EUDAT Summer School (Yann Le Franc, e-Science ...) - EUDAT
Going from a web of documents to a web of knowledge is one of the key goals set by the creator of the World Wide Web, Sir Tim Berners-Lee. This dream comes closer to reality each day with the development and integration of new formats and technologies for representing data as knowledge graphs, interlinking concepts across documents and databases. This presentation provides an overview of the generic concepts supporting Linked Data, including formats and the technologies that support them, and introduces the key initiatives relying on these technologies. We also address the challenge of semantic/knowledge modeling in science and other domains, and the need for more tools to support the use of these technologies. In particular, we present the semantic annotation service B2NOTE and how these formats and technologies are used to extend the description of datasets within EUDAT, enabling the creation of new datasets from multiple sources and multiple domains.
Visit https://eudat.eu/eudat-summer-school
The document discusses the Resource Description Framework (RDF) and its role in representing data on the Semantic Web. It provides examples of how RDF can represent relationships between resources through triples and graphs, and compares this to how the same information would be represented in XML. It also discusses RDF Schema (RDFS) and the Web Ontology Language (OWL) as languages used to build ontologies that can express richer relationships between resources on the Semantic Web.
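The RDF-versus-XML comparison can be made concrete with one fact expressed both ways. A sketch with invented element names and an invented document URI; the Dublin Core `creator` predicate is a real property:

```python
# The same fact ("the report's creator is Alice") in the two forms the
# summary contrasts: nested XML, where the relationship is implicit in the
# element structure, and an RDF triple, where it is an explicit, globally
# named edge. The XML elements and document URI are invented for the example.
import xml.etree.ElementTree as ET

xml_doc = ET.fromstring(
    "<document><title>Report</title><creator>Alice</creator></document>"
)
creator_xml = xml_doc.find("creator").text  # meaning depends on the schema

# RDF states the same thing as a labelled edge, reusing the real Dublin Core
# 'creator' property so the relationship means the same thing everywhere.
triple = (
    "http://example.org/doc/report",
    "http://purl.org/dc/terms/creator",
    "Alice",
)

print(creator_xml, triple)
```

The difference is what RDFS and OWL then build on: because the predicate is a shared URI rather than a local element name, ontology languages can attach further semantics to it (domains, ranges, subproperty relations) that apply across datasets.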
Libraries are shifting from physical institutions to becoming more "borderless" networks as they adapt to linked open data structures. As libraries share data across the web through unique URIs and RDF triples, it creates a "web of data" that helps both humans and machines understand complex concepts. However, linked open data also faces challenges related to data discrepancies, copyright and privacy issues. All libraries and cultural heritage institutions will need to cooperate and adapt their data practices to fully realize the benefits of linked open data.
What's in and what's out? Invited presentation at workshop for the project: Standards for Networking Ancient Prosopographies: Data and Relations in Greco-Roman Names. King's College London, 31 March, 2014.
1. Towards an Ontology for
Historical Persons
John Bradley
Department of Digital Humanities
King’s College London
john.bradley@kcl.ac.uk
2. Tim Berners-Lee on Linked
Data
All kinds of conceptual things, they have names now that start with
HTTP.
I get important information back. I will get back some data in a
standard format which is kind of useful data that somebody might
like to know about that thing, about that event.
I get back that information it's not just got somebody's height and
weight and when they were born, it's got relationships. And when it
has relationships, whenever it expresses a relationship then the
other thing that it's related to is given one of those names that starts
with HTTP.
Tim Berners-Lee: Linked Data presentation at TED 2009
3. Linked Data and History
If linked data is to connect historical data,
it is likely to work best when centered on
three kinds of entities:
Sources
Places
People
4. Prosopography as Linked
Data
“A particular prosopography aims to amass and present clearly a
quantity of information on all individuals in a given category” (PASE
website)
Prosopography has traditionally been a linked data-like exercise
[Image: a sample entry from a printed prosopography, showing links between Sources, People and Places. From J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527-641. Cambridge: Cambridge University Press, 1992.]
6. Person Identity: URIs
URIs provide an excellent model for
identifying persons globally
PBW “URI”:
http://db.pbw.kcl.ac.uk/pbw2011/entity/person/143353
7. Same person: multiple
URIs
Linked Data/Semantic Web can even
accommodate separate URIs for the same
person:
owl:sameAs
owl:sameAs
owl:sameAs
http://www.pone.ac.uk/record/person/12/
http://db.poms.ac.uk/record/person/2046
http://www.oxforddnb.com/view/article/22966
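Read as RDF triples, owl:sameAs links like these let a consumer merge records: since owl:sameAs is symmetric and transitive, the three URIs collapse into a single identity cluster. A minimal Python sketch, using the URIs from the slide (the grouping logic is an illustration, not any project's actual code):

```python
# Sketch: grouping person URIs connected by owl:sameAs into identity
# clusters via a small union-find. The URIs are the examples from the
# slide; the code is illustrative only.

from collections import defaultdict

SAME_AS = [
    ("http://db.poms.ac.uk/record/person/2046",
     "http://www.pone.ac.uk/record/person/12/"),
    ("http://db.poms.ac.uk/record/person/2046",
     "http://www.oxforddnb.com/view/article/22966"),
]

def same_as_clusters(pairs):
    """Take the symmetric, transitive closure of owl:sameAs links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for uri in parent:
        clusters[find(uri)].add(uri)
    return [sorted(c) for c in clusters.values()]

clusters = same_as_clusters(SAME_AS)
print(len(clusters))  # 1: all three URIs identify the same person
```

A consumer that respects owl:sameAs can then treat data attached to any of the three URIs as data about the one historical person.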
8. Prosopography: more than
“just” person identification
Historical persons survive for us through their
appearance in sources, and historians identify them not
only by their name, but also by what they did and by
other ways that they are described.
9. Prosopography and the
Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those
names.
3. When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
4. Include links to other URIs. so that they can discover
more things.
(Berners-Lee 2006: http://www.w3.org/DesignIssues/LinkedData.html)
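Principles 3 and 4 can be illustrated with a toy "follow your nose" crawler: looking up a URI returns structured data, and any URI found in that data can be looked up in turn. The sketch below uses an in-memory stand-in for the web; all URIs, property names and data are hypothetical:

```python
# Toy sketch of Linked Data principles 3 and 4: looking up a URI returns
# useful structured data, and that data links on to further URIs which can
# be looked up in turn ("follow your nose"). The WEB dict stands in for
# dereferencing HTTP URIs; everything here is a hypothetical illustration.

WEB = {
    "http://example.org/person/eucherius-4": [
        ("http://example.org/person/eucherius-4", "name", "Eucherius 4"),
        ("http://example.org/person/eucherius-4", "appearsIn",
         "http://example.org/source/plre-3"),
    ],
    "http://example.org/source/plre-3": [
        ("http://example.org/source/plre-3", "title",
         "Prosopography of the Later Roman Empire, vol. 3"),
    ],
}

def dereference(uri):
    """Principle 3: a URI look-up yields structured data about the thing."""
    return WEB.get(uri, [])

def follow(start):
    """Principle 4: discover more by following the URIs found in the data."""
    seen, queue, triples = set(), [start], []
    while queue:
        uri = queue.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in dereference(uri):
            triples.append((s, p, o))
            if o.startswith("http://"):  # an object URI is a link to follow
                queue.append(o)
    return triples

data = follow("http://example.org/person/eucherius-4")
print(len(data))  # 3 triples gathered across two look-ups
```

Starting from a person URI, the crawler also picks up the description of the source the person appears in, which is exactly the cross-linking of people, sources and places discussed above.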
10. From “Closed” to “Open”
Prosopography
Closed: single research team, contained
domain, controlled semantics, tight boundary
Open: collaboration between partners, fuzzy
boundaries, multiple overlapping interests
Examples:
POMS and PONE
PASE to “PASEN”
PBW to “Crusades”
11. PASE->”PASEN”: the move
from closed to open data
[Diagram: overlapping project scopes. PASE's Anglo-Saxons overlap with the Anglo-Normans of successive Norman projects (Normans 1, 2 and 3), each of which also covers other people.]
The linking of people is only a part of the issue: the linking of data about the people each project holds also needs to be thought about.
Boundaries between projects are not necessarily so clear-cut.
12. Existing data models for
prosopography
DDH: “factoid Model”
PBE/PBW
PASE
POMS
PONE
Charlemagne
DDH: Clergy DB Model
FOAF
13. OHP and other models
DDH: “factoid Model”
DDH: Clergy DB Model
Ontology for Historical Persons
FOAF
Inference Layer
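One way to read the "Inference Layer" in this picture is as a set of mapping rules: assertions expressed in a project-specific vocabulary are rewritten into shared terms such as FOAF's, much as RDFS subPropertyOf inference would do. A minimal sketch, in which all property names and URIs are hypothetical illustrations, not the actual OHP vocabulary:

```python
# Minimal sketch of an "inference layer": triples using project-specific
# properties are mapped onto a shared vocabulary (here FOAF) in the style
# of rdfs:subPropertyOf entailment. All property names and URIs below are
# hypothetical illustrations.

SUB_PROPERTY_OF = {
    "ohp:hasPersonalName": "foaf:name",
    "ohp:knewPerson": "foaf:knows",
}

def infer(triples, mapping):
    """For each triple whose predicate has a broader shared property,
    add the corresponding shared-vocabulary triple."""
    inferred = list(triples)
    for s, p, o in triples:
        if p in mapping:
            inferred.append((s, mapping[p], o))
    return inferred

factoids = [
    ("ex:person/1", "ohp:hasPersonalName", "Anna Komnene"),
    ("ex:person/1", "ohp:knewPerson", "ex:person/2"),
]

enriched = infer(factoids, SUB_PROPERTY_OF)
# enriched now also contains foaf:name and foaf:knows triples
```

The point of the layering is that a FOAF-aware consumer can use the inferred triples without knowing anything about the richer factoid-style model underneath.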
14. FOAF: Friend of a Friend
“FOAF is a project devoted to linking people and information using the
Web. Regardless of whether information is in people's heads, in physical or
digital documents, or in the form of factual data, it can be linked.”
“FOAF does not compete with socially-oriented Web sites; rather it provides
an approach in which different sites can tell different parts of the larger
story, and by which users can retain some control over their information in a
non-proprietary format.”
http://xmlns.com/foaf/spec/
15. OntoLife:
Personal knowledge management
“model life by describing a person’s
Characteristics
Relationships
Experiences”
Kargioti, Eleni (2009). OntoLife: An
Ontology for Semantically Managing
Personal Information
16. TEI: Names, Dates, People
and Places
“... this module allows one further to represent a personal name, to
represent the person being named, and to represent the canonical
name being used. A similar range is provided for names of places
and organizations. The main intended applications for this module
are in biographical, historical, or geographical data systems such as
gazetteers and biographical databases, where these are to be
integrated with encoded texts.”
TEI, section 13 introduction (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html)
17. TEI: Personography: “Basic
Principles”
Information about people, places, and
organizations, of whatever type, essentially
comprises a series of statements or assertions
relating to:
characteristics or traits which do not, by and large, change over time
characteristics or states which hold true only at a specific time
events or incidents which may lead to a change of state or, less frequently, trait.
TEI, section 13.3.1 (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html)
18. TEI: Personography textual markup:
Marriage of William Morris
Persons identified by <person> tag
References to people in text tagged with <name>
An event tagged in the text with <event>
No roles for people in event specified
TEI, section 13.3.2.2 (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html)
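The pattern described here (persons declared with <person>, textual references tagged with <name>, incidents with <event>) can be sketched as a small TEI-like fragment and inspected with Python's ElementTree. The fragment is an illustrative approximation of the approach, not the actual example from the TEI Guidelines:

```python
# Sketch of the TEI personography pattern: <person> declares the person,
# <name ref="..."> tags a reference in running text, and <event> records
# an incident. An illustrative approximation of TEI markup, not the exact
# encoding given in the Guidelines; note that, as the slide observes, no
# roles for the people in the event are specified.

import xml.etree.ElementTree as ET

FRAGMENT = """
<listPerson>
  <person xml:id="WM">
    <persName>William Morris</persName>
  </person>
  <person xml:id="JB">
    <persName>Jane Burden</persName>
  </person>
  <event type="marriage" when="1859-04-26">
    <desc>Marriage of <name ref="#WM">William Morris</name>
      and <name ref="#JB">Jane Burden</name>.</desc>
  </event>
</listPerson>
"""

root = ET.fromstring(FRAGMENT)
persons = root.findall("person")               # the declared persons
event = root.find("event")                     # the tagged event
refs = [n.get("ref") for n in event.iter("name")]  # text refs -> person ids
```

The ref attributes tie the in-text references back to the declared <person> elements, which is what makes the markup usable as prosopographical data rather than just annotated text.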
19. Core structure for DDH’s
Prosopographical databases
[Diagram: core structure. A Person is connected to Assertions; each Assertion is an instance of an Assertion Type, typed by Authority Lists, appears in a Source, and is connected to a Role, Date, Location or Possession.]
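The core structure can be sketched in code: an assertion links a person to a source, is typed from an authority list, and may carry a role, date, location or possession. A minimal sketch in which all class and field names are illustrative, not DDH's actual schema:

```python
# Sketch of the factoid-style core structure used in DDH's prosopographical
# databases: an Assertion connects a Person to a Source, is typed from an
# authority list, and optionally carries Role, Date, Location, Possession.
# All names and sample values here are illustrative, not the real schema.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    uri: str
    display_name: str

@dataclass
class Source:
    uri: str
    title: str

@dataclass
class Assertion:
    person: Person                  # the Person the assertion is about
    source: Source                  # the Source it appears in
    assertion_type: str             # typed by an authority-list value
    role: Optional[str] = None      # connected Role, if any
    date: Optional[str] = None      # connected Date, if any
    location: Optional[str] = None  # connected Location, if any
    possession: Optional[str] = None

person = Person("ex:person/1", "Eadmer 1")
charter = Source("ex:source/S12", "Charter S 12")

factoid = Assertion(
    person=person,
    source=charter,
    assertion_type="office",
    role="witness",
    date="c. 1095",
)
```

Because every assertion carries its source, the model records not just what is claimed about a person but where each claim is made, which is the heart of the factoid approach.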
24. Building the OHP
Needs to be a collaborative venture
I have begun to talk up the idea
If there is interest, a workshop to explore it and develop ideas could be set up at King's in London.
Comments??
We at the Department of Digital Humanities – DDH – at King's College London have worked on digital prosopography for years now – my second substantial DH project when I joined what was then CCH in 1997 was the Prosopography of the Byzantine World. Since then, I and my department have been involved in the development of 5 other prosopographical projects: for Anglo-Saxon England, for the Clergy of the Church of England, two for northern Britain, and one for Charlemagne's Europe. As you will hear, based on this experience, I believe that prosopographical thinking has something specific to offer to the world of linked data. In my talk today I am outlining, for the first time publicly, some thoughts about what I'd like to turn into some kind of project: the development of an Ontology for Historical Persons.
It seems to me that Linked Data has developed out of the Semantic Web as a kind of simplification of its goals. Here is a quote from a talk Tim Berners-Lee gave at the TED 2009 conference about Linked Data which outlines the key ideas behind Linked Data:
that one names things (not only web pages) by giving them URIs (like URLs) that start with http.
that if you give one of these to the WWW, you get back useful information in a standard format (elsewhere, the format is specified to be the structured-data oriented RDF format)
and, that the data you get is situated in a digital world that is linked to other pieces of data
I am a digital humanist and it is natural, then, for me to think of Linked Data in the context of the humanities. In the context of digital history, linked data has the potential of enabling a much richer use of digital historical data created by historians around the globe, and it could create a new digital global eco-system for carrying out historical research.
One would expect that historians will be interested in approaching this – still imagined – web of linked data historical resources primarily from three different starting points:
Historians are likely to turn to the web – indeed, already do so when they can, to locate and access good quality textual sources. The preparation of texts for scholarly use is a highly specialised activity and one can expect there to be centres of expertise all around the world that would focus on the scholarly task of preparing and publishing editions of the sources. One can well imagine URIs being assigned to digital sources that operate as identifiers as a matter of course.
Information about place is also important to historians, and often provides a bridge to the interests of the general public. Here too, specialist knowledge is necessary to prepare and organise data about historic places. The sorting and management of historical places is already happening. The Pleiades project (http://pleiades.stoa.org) provides us with an excellent example of how to structure and make available information about historic places. Their website tells us that they have close to 100,000 places, names and locations in their database, and that it provides "extensive coverage for the Greek and Roman World".
The third, and perhaps most obvious, "entry point" to historical data would be via persons. And here we come to a linked-data perspective on prosopography – what I’m talking about here.
This rather informal definition, provided by our Prosopography of Anglo-Saxon England – PASE – captures the essence of a prosopographical project. It says that prosopography is a kind of historical study that "aims to amass and present clearly a quantity of information on all individuals – people – in a given category": for PASE this was Anglo-Saxon England. To build PASE, its researchers read a substantial number of Anglo-Saxon sources, more than 2700, ranging in size from a short legal charter up to the Anglo-Saxon Chronicles and the 1086 – post Norman conquest – Domesday Book. They worked to identify all the people named in these various sources, and recorded information about them in a database. Each name in the sources thus turned into a reference to a person, and PASE became a prosopography: aiming to provide a definitive list of the people that appear across all the surviving Anglo-Saxon sources.
It is worth noting for a moment that, even before prosopography became a digital activity as it was for PASE, it has always been in some sense a linked-data kind of activity. Here is an entry for a person – Eucherius 4 – who appears in John Martindale's Prosopography of the Later Roman Empire, volume 3 – a prosopography that appeared in print, and dates from 1982. As is the case in all the other entries about people that Martindale has identified, the person is given a standardised name (arguably like Linked Data's first, URI requirement), information is provided about him when one looks him up (the second requirement), and the information is richly linked to sources, places and other people (the third requirement). We see here, then, exactly the three key "historical entry points" I mentioned in the last slide: historical sources, people, and places.
Prosopography, as a project that aims to identify historical persons, is widely done as a kind of primary goal in the digital humanities: I show here the web pages for several of "our" projects: the Prosopography of the Byzantine World, the Clergy of the Church of England Database, and PASE (already mentioned). I couple "our" projects with a few other mainly prosopographical projects that I know about: the Prosop project of Will Hanley (Florida State University), which aims to create "a large database of historical names" that will hold data contributed by others; Caroline Bowden's "Who were the Nuns" project, which aims to "identify those [Catholic] women who entered the English convents from the foundation of the first new house in Brussels in 1598 until the end of the exile period (in England)"; Alison Booth's "Collective Biographies of Women", which is creating what she describes as "an annotated bibliography of English-language books that collect three or more short biographies of women only: a forgotten British and American publishing tradition that provided a surprisingly ample and wide-ranging biographical history of women"; and the Orlando project, which "is an online cultural history generated from the lives and works of over 1200 [British women] writers". Note the temporal and cultural overlap between, say, Orlando and CBW: an issue I shall return to shortly.
Other projects, although not perhaps primarily prosopographical, take on some degree of prosopography "on the side". The public website that gives access to the Merlin database at the British Museum, for example, links and identifies people associated with the objects it describes. Indeed, I have included a simple illustration of people in a CIDOC-CRM structure – the same one Martin Doerr showed us yesterday – because the CRM, although aimed at organising data about cultural objects, clearly identifies people associated with these cultural objects as a part of the cataloguing process.
There must be thousands of projects which publish their results on the WWW that have an historical prosopographical focus, or at least an historical prosopography component.
Prosopography aims to uniquely identify a person, and when the prosopography is structured and digital, this unique identification maps naturally onto a URI. Indeed, as we have been revising the web presence of our prosopographies, we have been turning the unique identifiers we already have for people into RESTful links directly to the data that the project holds about the associated person, thus allowing these links to act as URIs that identify that person. Here we see the RESTful URI in our Byzantine World project that links to, and therefore identifies, one particular PBW person: a certain Kallinikos, who was a hegoumenos at Athos.
The "AAA" principle described by Allemang and Hendler in their book "Semantic Web for the Working Ontologist" – that in the WWW anyone can say anything about any item – is extended by them to recognise that the existence of a global URI to identify a person doesn't necessarily mean that there is only one global URI for that person: in the WWW different projects can legitimately define their own URIs for the same person.
Here we see three different URI references for the same person, Roger de Quincy, and two of these IDs for de Quincy – in the People of Medieval Scotland and People of Northern England databases – are from scholarly related projects. Indeed, there is a formal linking mechanism in their separate databases precisely so that the project teams can assert when entries in the separate projects refer to the same person. Also, one of the projects provides a mechanism to link to the online Dictionary of National Biography. I have identified these formal links as a kind of "owl:sameAs" connection.
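In RDF terms such cross-project links can be stated directly with the standard OWL property. The URIs below are invented stand-ins for the real project identifiers:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Three independently minted, hypothetical URIs for Roger de Quincy:
# one each from People of Medieval Scotland, People of Northern England,
# and the online Dictionary of National Biography.
<http://example.org/poms/person/roger-de-quincy>
    owl:sameAs <http://example.org/pone/person/roger-de-quincy> ,
               <http://example.org/odnb/roger-de-quincy> .
```

A linked-data client encountering any one of these URIs could then merge the data published about all three, since owl:sameAs declares that they identify the same individual.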
Although the identification of persons is, obviously, a key part of what prosopography is about, and hence, as linked data, the assignment of URIs for people is an obvious part of digital prosopography (indeed, its first principle), we need to think about the other parts of the linked data principles as well.
The second principle – that when one uses the URI to fetch something from the WWW, useful data comes back – is also a key idea of prosopography. Since the only information we have about historical people comes from how they are presented in the historical record, the collection of information about them that has been gleaned from those records is arguably a key part of their actual historical identity.
At present, in our projects at least, invoking the URI to get the data from our structured prosopographies only delivers a web page that presents the data for a human reader through a browser. However, the structured nature of the data behind these dynamically created web pages is entirely compatible with the ways of thinking about data present in linked data’s RDF – indeed, we have recently been exploring how best to map our PASE data into RDF structures.
Thus, from a linked data perspective, where the data is presented as RDF and assumed, at least, to be highly structured, the formal nature of this data is both technically and historically as tightly connected to the identity of the person himself or herself as the URI is.
Our thinking about prosopography for linked data needs minimally to accommodate all four of Berners-Lee's linked data principles. However, these principles, by themselves, only deal with a part of the problem that arises if one thinks about historical structured data, linked together but created by semi-independent projects around the world. Indeed, I believe that once one recognises that different teams of people can say different things about the same historical person (Allemang and Hendler’s AAA principle again: “anyone can say anything about any topic”), one has to go beyond these four basic linked data principles to bring in some of the other ideas that come out of that other major component of the semantic web: computer ontologies.
Perhaps the Web, with its ready access to material from all over the world, already makes us more aware that separately conceived projects would be most useful if somehow the data they contain could fit together. This is most definitely an issue that is brought into sharp focus by prosopography. A single individual can appear in more than one historical context, where he or she could well be represented quite differently. We might expect this phenomenon to occur with prominent people such as Alexander the Great, but who would have thought that a less well known person such as Harald Hardrada – in our Anglo-Saxon project – also makes an appearance in Byzantium, where our PBW project operates?
Furthermore, this AAA phenomenon is beginning to transform how even projects conceived under a single research umbrella operate. I characterise here what has been happening over the past few years. The important thing is that our various projects, which in the past would often have been conceived as coming about from the labours of a single research team, have begun to shift to a much more open, multi-player collaborative context. Whereas these older projects often worked hard to draw a closely defined boundary – a prosopography of the Anglo-Saxons, or of the English clergy – and to define a suitable, but largely self-contained, conceptual framework in which to operate, our partners have more recently begun to talk in terms of much larger research ventures characterised by teams of more independent researchers, often with fuzzy boundaries between their interests, and with research interests which, although they might connect together in some ways, also represent different and even not-fully-compatible viewpoints.
Three brief examples (expanded in presentation...)
A second example: the move from PASE to what I call here “PASEN” is even more striking. PASE – our Anglo-Saxon prosopography – ends more or less with the creation of William the Conqueror’s Domesday Book. It provides information about Anglo-Saxons both before the Norman conquest in 1066 and after it in 1086, and also contains a rich set of data about the Anglo-Normans who gradually took over all elite positions in the English kingdom. Domesday Book provides, then, a kind of transitional document between the Anglo-Saxon-oriented project that PASE has been, covering roughly AD 600 to 1086, and an Anglo-Norman period that begins with the Norman conquest in 1066.
PASE's Domesday historian Dr Stephen Baxter is a good example of someone who has become caught up in this transition. On one hand, he has continued to work on some of the thorny issues that arise out of the identification of Anglo-Saxons in Domesday Book. However, he has more recently begun to develop a new research agenda taking up the history of the Normans. Clearly, work on the Normans takes us outside England, and beyond Anglo-Saxon scholars, to areas of France and even other parts of Europe where the Normans were important, and it involves different scholars, from Norman studies. Indeed, any Norman individual can only partly be understood by considering his or her role in post-conquest England, and many can only partially be understood by focusing on their activities in continental Europe. Furthermore, there are already well established centres of scholarship on the Normans in France and elsewhere that focus on their historical role in places outside England. The project becomes, by its very nature, a much more collaborative, and diverse, one.
Thus, although Stephen and I originally jokingly referred to the extension of PASE to include more about the Normans in England as requiring a change in the name of the project from PASE – Prosopography of Anglo-Saxon England – to PASEN – Prosopography of Anglo-Saxon England and the Normans – in reality the story of future work (and the conception currently being worked on in project proposals) would need to recognise a much more open, collaborative venture than PASE needed: bringing in experts with a focus outside England alongside those who focus on the Anglo-Normans.
From a linked data point of view, PASE would become one partner among several Norman-oriented projects. Entity domains would overlap between projects not only for people, but also for sources and places; each project could have its separate but linked list of Normans active in its area, each with different non-Norman people also involved, and – importantly for my discussion today – each with a different sense of the kinds of data it was collecting about its people.
The question of different project data organised according to different structures is one that must confront any project that aims to bring material together from different projects, such as this imagined Norman or crusades project. Although our Norman project discussions are not far enough advanced for us to know what data would be collected by the different collaborators, we do already know that different prosopographical projects use different data models upon which to base their data structures. Are there common concepts that operate across prosopography that could help?
We ourselves at DDH have been involved in a number of digital structured prosopographies that cover a broad range of cultures and time periods, and most of them use data models that are based on our so-called "factoid" model, which I will introduce briefly in a couple of moments. However, not everyone doing prosopography can be expected to use our factoid model: indeed, even our very own Clergy of the Church of England project does not use the factoid model directly. A look around at other projects outside our collection of prosopographies shows the range of data models that have been applied.
See, for example, the Digital Prosopography of Renaissance Musicians, who based their model for data about their musicians around FOAF with some extensions. The Orlando project uses an XML tagging system for their biographical documents that is only loosely connected to the well known TEI markup scheme, and was developed specifically to deal with the kinds of events they found to appear in the lives of the women authors they were working with. Ralph Mathisen, in his 2007 article in Katharine Keats-Rohan's collection "Guide to the Principles and Practice of Prosopography", shows us the input screen for the Attica website, and thereby reveals the relatively simple structure behind that prosopography.
Then we have OntoLife: an ontology that is focused on assembling data for what it calls "Personal Knowledge Management", and has seemingly been created for representing information about a modern-day professional individual, with attributes for information like medical history, work experience, language skills, place of birth, etc.
Finally, again we see CIDOC-CRM with its structure for Actors and events in which they took part. Anyone applying its model to historical figures must be doing prosopography to some degree.
It is striking that if we were able to pull RDF data directly from the differing structures behind each of these projects (Linked Data's third principle), they would seemingly be dealing with common semantic entities – persons, places, activities, etc – but their detailed structures would not be immediately compatible. Can anything be done about this?
Perhaps we can begin to see the need for a common conceptual framework that allows us to associate these different kinds of data. Like the CIDOC-CRM, which provides a common framework for information about cultural artefacts, an Ontology for Historical Persons would provide a common conceptual framework for data about historical persons. In the same way that at least much of the data from any particular cultural artefact management system can be mapped to the common concepts of the CIDOC-CRM – because the CRM represents a broadly-agreed understanding of the semantics of data held by cultural institutions about their collections – perhaps the data from these different prosopographical projects could also be mapped to the common concepts of an OHP.
We can perhaps see two questions immediately:
Is there a common conceptual framework that applies across prosopographies? And
Is there already an ontology in existence that deals with this framework?
There are, of course, already a number of ontologies, or ontology-like models, that inhabit a domain space similar to the one the OHP would work in. Will any of these already do?
Perhaps the most obvious semantic web prosopography ontology is FOAF: Friend of a Friend. Although FOAF really has a modern social web domain, with attributes for a person related to what any thoroughly modern digital citizen would want, it is nonetheless occasionally used as the basis for modelling historical and semi-historical persons too – remember the Digital Prosopography of Renaissance Musicians I mentioned earlier, for example. To me, too many of FOAF's attributes reflect assumptions about modern life in the digital and internet world. With FOAF being used so widely in spite of this, however, I would hope that wherever possible any similar OHP entities would be explicitly related to FOAF properties and classes through subclassing or equivalence.
OntoLife – an ontology for representing "personal information" – covers a broader domain than FOAF, but still seems to have a focus on a modern perspective on a person's life. A history project that involved telephone numbers, say, or identity cards, or skills and qualifications, is likely to be a 20th century one, and might also be one that, like FOAF, focuses on people in their professional life. OntoLife might be useful here.
However, as much as I liked the overall categorisation of the kinds of information organised within the OntoLife ontology as representing a person's "characteristics, relationships [to others as individuals, and to organisations], and experiences", OntoLife feels like a rather inadequate basis for an ontology meant to deal with a broad range of historical periods.
However, although OntoLife doesn't seem to be an appropriate model to form the basis for the OHP, one would want to allow compatible concepts to be mapped to it – in the same kind of way that any OHP would need to be able to map appropriate ideas to FOAF too.
Having looked at two prosopography-like models that did not emerge from the humanities, we should turn briefly to consider the greatest structured data initiative in the digital humanities: the Text Encoding Initiative – or TEI – because, like so much in the digital humanities, it has something to offer here. What does it have to say about digital prosopography? Although the TEI primarily focuses on the issues in marking up texts for textual scholarship, it does also venture into the representation of non-textual items.
The TEI's chapter entitled "Names, Dates, People and Places" brings a stronger historian's perspective to the representation of material about historical persons than FOAF or OntoLife. The chapter starts off very usefully by focusing on the relationship between names and people – definitely a part of the task of a prosopographer, although only a part of it, as I hope earlier parts of this presentation have made clear. Its section 13.3.3 is entitled "Biographical and Prosopographical Data", and in its introduction it says that it is aimed at researchers "creating or converting an existing set of biographical records, for example of the type found in a Dictionary of National Biography", or "creat[ing] ... a database-like collection of information about a group of people, possibly but not necessarily the people referenced in a marked-up collection of documents".
The TEI's biographical and prosopographical section begins with a categorisation of the kinds of information that this TEI module is meant to support. Its characterisation of the three kinds of information about people connects strongly with our own experience – going back a number of years in our "DDH prosopographies" – of the kind of data that our prosopographers want to work with, and challenges the assumption I have heard recently that "event-driven" models cover all the needs of prosopography. Traits and states also represent information that historical sources record about people, and many of these do not really map into an event-centred model.
Later in the guidelines we find an example of the application of the TEI's prosopographical module that marks up a bit of text describing the marriage of the artist William Morris to Jane Burden in 1859.
I particularly like the fact that, through the use of the event tag, the TEI recognises that information about historical persons is usually found in, and grounded in, textual sources (although it isn't made clear what the underlying text in this particular example is). There is a clear separation of the person from the name of a person, and we can see here names-as-references-to-a-person which, by being nested inside an event tag, formally establish the connection of the people named there to the marriage event. The conjunction of source, event, and people that happens in this markup resonates well with our own model for structured prosopography, as you will see in a moment. Perhaps the most striking thing missing, however, is any sense of roles for the people being recorded: no one is formally identified as the bride or groom in the structure, for example.
In sum, there are lots of good ideas that belong in an Ontology for Historical Persons in the TEI guidelines, and because they are in the TEI we can be pretty sure that they will work well with established scholarly text study practice, and are more likely to be recognised as significant by the digital humanities community.
Now it is time to describe the model for structured prosopography that we have followed at DDH for almost all of our prosopographical projects. The "factoid" or "source assertion" model has been used successfully by us in a range of prosopographical projects since the development of the Prosopography of the Byzantine Empire in the 1990s, and although there have been some significant enhancements to the model since it emerged during the creation of PBE, several of its key ideas are still with us.
The central idea in it is the "source assertion" – formally called the "factoid". This can best be thought of as an item that represents a spot in a source where something prosopographical happens: where the source makes an assertion about a person or persons that the prosopographer wants to record in his or her data. The assertion might be as simple as giving an historical person a title, or naming him or her as holding an office. It might involve more than one person: asserting a relationship between two people, for example that Elizabeth I is the daughter of Henry VIII. It might represent something more complex, such as an event in which several people are involved: in the text we have just seen about the marriage of William Morris to Jane Burden, more than two people are mentioned as having been involved – and in our source-assertion model, the various people involved can be given roles in the event. Any of these assertions can also bring in information about associated geographic places (the event happened at a particular place) or possessions, and, of course, all assertions can have an historical date or range associated with them.
The point of the "source assertion" is that it represents a kind of nexus between a segment of an historical source, a group of one or more people, some geographic places and possibly some possessions. Like the TEI personography module, the assertion is not necessarily an event: it could be an assertion that a person simply was a bishop or king: what the TEI calls a "trait" or "state".
We can see factoids/source assertions present even in traditional prosopography. Here we see Martindale's article about Eucherius 4 again, with the assertions that he makes, and links to historical sources, labelled as factoids, or "source assertions". The final one in Martindale's short article is shown below it in structured form. We can see the factoid linking two people, each with a role attached, to a spot in a source through its description of a particular relationship and event between them.
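One could sketch such a factoid in RDF along the following lines. The `ohp:` vocabulary here is entirely invented for illustration – no such ontology yet exists – but it shows the nexus of source, people and roles that the model describes:

```turtle
@prefix ohp: <http://example.org/ohp#> .   # hypothetical OHP namespace

# A factoid / source assertion anchored to a spot in a source...
<#factoid-1> a ohp:SourceAssertion ;
    ohp:fromSource   <#historia-francorum-passage> ;
    ohp:assertsEvent <#imprisonment-event> .

# ...describing an event in which two people participate, each with a role.
<#imprisonment-event> a ohp:Event ;
    ohp:hasParticipant [ ohp:person <#victorius-4> ; ohp:role "captor" ] ,
                       [ ohp:person <#eucherius-4> ; ohp:role "prisoner" ] .
```

Note how the roles, missing from the TEI marriage example discussed earlier, attach naturally to the participants here.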
In the article this figure comes from, Michele Pasin and I looked at elements of the factoid model that mapped onto parts of CIDOC-CRM. Here you see one of the transitional diagrams that eventually led us to a representation of our factoid approach, in part at least, in terms of CIDOC-CRM's classes and properties.
One of the things that came out of this work was the realisation that our model provides two levels of assertion: at one level, in essentially modern times, our prosopographer asserts that a source from historical times says something. The thing the source says is a second layer of assertion that sits in front of the assertion itself. In this diagram, then, we can see that Martindale – our modern-day prosopographer – asserts that Gregory of Tours, in his Historia Francorum, asserts that Victorius 4 imprisoned Eucherius 4. This "double assertion" construct makes clear how our factoid/source assertion model combines the characteristic of being "source driven" with being a prosopographical interpretational act.
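The two levels of assertion can be sketched as two linked resources; again the `ohp:` terms are hypothetical, invented purely to make the layering concrete:

```turtle
@prefix ohp: <http://example.org/ohp#> .   # hypothetical vocabulary

# Level 1: the modern prosopographer asserts something about a source.
<#assertion-by-martindale> a ohp:ScholarlyAssertion ;
    ohp:assertedBy <#martindale> ;
    ohp:claimsThatSourceAsserts <#assertion-in-source> .

# Level 2: what the historical source itself is taken to assert.
<#assertion-in-source> a ohp:SourceAssertion ;
    ohp:source    <#historia-francorum> ;
    ohp:statement "Victorius 4 imprisoned Eucherius 4" .
```

Keeping the two layers distinct preserves both the interpretational act of the prosopographer and the testimony of the source.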
Finally, I’ve already mentioned that CIDOC-CRM has prosopographical components, and that people who use the CRM’s model to represent their collections have the capability to assert things about historical persons. What might CIDOC-CRM offer to a model for historical persons?
I can’t say I know for sure! At first glance, perhaps, the CRM would appear largely to model activities involved in the collection and management of cultural objects – the kind of work done by archives and museums rather than historians – and, indeed, many of the classes and properties in the CRM serve exactly that purpose. However, the fact is that the CRM crosses the boundary here and there between cultural objects and historical persons, probably because, as I mentioned earlier, museums and archives do; and we can see this in the modelling of images, documents and people involved in the Yalta conference at the end of World War II – all represented in terms of the CRM’s entities.
As a result, the CRM has model elements that would play a significant role in an OHP. Its Appellation model, its modelling about existence, and its model of temporal relationships are obvious candidates.
So, what kinds of things belong in an Ontology for Historical Persons? Here we see a deliberately rather crude illustration, only partly formal, that touches on the kinds of things that an OHP might deal with.
The three ovals at the top would be ontologies, and would hence primarily be made up of classes and properties. The three boxes at the bottom represent three hypothetical repositories for historical things: of people, of places, of sources. The three repositories would likely be maintained by different scholars, and could presumably use the ontologies in the top area of the figure to formally define some of the semantics for their particular pieces of data.
Note first that there are three separate ontologies represented here. In the middle, and the largest oval, is our Ontology for Historical Persons: the focus of this talk. The other two ontologies – for historical sources and for places – would presumably be rich and complex too, and although the OHP might make use of some of their classes, because much of the work of prosopography involves sources and places, we wouldn't want to duplicate their conceptual models in the OHP.
Our prosopographical project (the red box in the middle) would be creating digital surrogate instances for people, assertions, and the various other classes defined in the OHP. We have heard here about co-reference, but the discussion has tended to focus on the co-referencing of things – of a particular person, for example. Ontologies provide a different way to think about co-reference from what we have mainly heard here. The place of ontologies in the semantic web is rather subtle, and it supports rather different possible roles for an OHP in any particular prosopographical project. In the same way that there are two and a half different ways that a museum might relate its data to CIDOC-CRM, these projects might relate the models of their data in two and a half different ways too. A prosopographical project might model its structure directly on that defined by the OHP; or, second, it might already have a model for its data, in which case it could use ontology tools to map its own entities as far as possible to those in the OHP. In addition, it might well subclass OHP entities to enrich the vocabulary to meet its own specialised needs.
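The second and third of these ways of relating a project's model to the OHP can be sketched with standard RDFS/OWL mapping axioms. Both the project vocabulary and the OHP terms below are hypothetical:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ohp:  <http://example.org/ohp#>    .  # hypothetical OHP namespace
@prefix proj: <http://example.org/myproj#> .  # a project's own vocabulary

# Map an existing project entity onto the shared concept...
proj:Individual owl:equivalentClass ohp:Person .  # same concept, different name

# ...or specialise an OHP entity to meet the project's own needs.
proj:Cleric          rdfs:subClassOf ohp:Person .
proj:OrdinationEvent rdfs:subClassOf ohp:Event .
```

With such mappings in place, a reasoner or query engine can treat project-specific instances as instances of the shared OHP classes, without the project abandoning its own model.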
Identifying people in history requires connections with historical documents, and if there were a good online repository for the sources a project was working with, the project would not need to duplicate any of that repository's data, but merely point to it in good Linked Data fashion. In a similar way, if there were a good repository of places (e.g. Pleiades) that covered the area the prosopography worked on, it could simply reference materials in it.
Like any good conceptual reference model, or computer ontology, the OHP would consist primarily of class and property definitions. The boxes represent some of the conceptual entities an OHP might represent – but they are not meant to be read too formally here!
Since it is representing prosopographical work, one would expect one of the classes to be Person. Call me conservative here, but I prefer the word "Person" to "Agent" or "Actor", since both of those terms imply a kind of active role that might fit with events a person is involved in, but fits less well with assertions that give the person attributes (states, or traits, to use the TEI's terminology).
Sometimes a person is in fact a group of people – a legal entity, say. CIDOC-CRM's approach to the handling of Groups could fit well here.
People have names, and often more than one name. Here, CIDOC-CRM's appellation model is helpful. There will likely be a canonical name attached to the person, but there are also likely to be variants on this name revealed by the sources, and I think these should be dealt with as a kind of trait for the person.
All prosopographical work aims to make claims about the people it detects. Perhaps, indeed, it is possible to characterise a prosopography as a set of persons and a set of assertions about them. Here, I call these claims "assertions", and, as in our "factoid" model, the assertions come out of an historian's understanding of the sources that the project has worked with.
Assertions can be thought of as being of several different kinds. Names as they appear in the sources I have already touched on, but prosopography is often interested in sorting out people's place in society, and thus all of our prosopographies, and perhaps most others too, are interested in the offices, titles, or occupations people are reported as holding. These are also best thought of in terms of the TEI's states/traits. People will also be associated with events by the role they play in them. In most of our prosopographies, sources also connect people to possessions and places, and most define a whole range of kinds of relationship (family, legal, etc.) between individuals.
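The several kinds of assertion just described can be sketched in factoid fashion: each assertion ties a person to a claim (a name variant, an office, a role in an event) and to the source passage that supports it. This is only an illustrative sketch under my own invented names, not an agreed OHP vocabulary.

```python
# Factoid-style assertions (illustrative): every claim about a person
# carries a pointer back to the source it was read from.
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    person: str    # URI of the person the claim is about
    kind: str      # e.g. "name-variant", "office", "event-role", "relationship"
    value: str     # the content of the claim
    source: str    # URI of the source passage the historian read
    role: str = "" # used when kind == "event-role"

# Hypothetical example data.
factoids = [
    Assertion("person/1234", "name-variant", "Aelis", "charters/ch-042"),
    Assertion("person/1234", "office", "castellan", "charters/ch-042"),
    Assertion("person/1234", "event-role", "a sale of land", "charters/ch-051",
              role="witness"),
]

# A person can then be viewed, factoid-fashion, as the set of assertions about them.
about_1234 = [a for a in factoids if a.person == "person/1234"]
```

Note that the person itself carries almost no attributes directly; everything contestable lives in the assertions, each of which remains traceable to its source.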
One other task, and one that our historian colleagues often don't expect, is the definition and management of associated authority lists. Whether it is a list of the offices or titles used in the target society, or perhaps of the kinds of event that society recognises, authority lists become intellectual products of the prosopographical work, just as the list of persons does. In the same way that a prosopography might expect to become an authority for the persons of its society, which other projects can then reference through its URIs, it might find itself becoming an authority for these other lists it has created. Thus, elements of the OHP to facilitate the management of authority lists would be useful.
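A minimal sketch of such an authority list might mint a stable URI for each controlled term, so that assertions reference the entry rather than a free-text string, and other projects can cite the same URIs. The class and URI scheme here are my own illustration, not part of any existing system.

```python
# Illustrative authority list: controlled terms with minted, stable URIs.
class AuthorityList:
    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.entries = {}  # label -> URI

    def add(self, label):
        # Mint a URI for a new controlled term; reuse it on repeat lookups.
        if label not in self.entries:
            self.entries[label] = f"{self.base_uri}/{len(self.entries) + 1}"
        return self.entries[label]

# Hypothetical list of offices for the target society.
offices = AuthorityList("http://example.org/prosopography/authority/office")
castellan_uri = offices.add("castellan")

# Repeated use returns the same URI: the list, not the string, is authoritative.
assert offices.add("castellan") == castellan_uri
```

The point of the sketch is the contract, not the code: once minted, an entry's URI never changes, so both internal assertions and external projects can rely on it.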
So, I hope I have successfully explained why an Ontology for Historical Persons might be useful, and given you some preliminary sense of what it might be like.
As you have probably guessed, however, I currently have no OHP. Why, if it is such a good idea, have I not built it and then come to you to talk about it? It is because I believe that the OHP, if it is to have any hope of being useful and adopted, must grow out of a kind of community engagement and approval.
I could formalise our work on prosopography into a kind of global model. However, in the same way that CIDOC-CRM, if it were to be useful, had to be created as a collaborative project so that it represented as well as possible a range of different perspectives on a common domain, the OHP would need to be built collaboratively so that it properly represented perspectives on prosopography beyond our own at DDH.
So, where would this collaboration come from? CIDOC-CRM has been blessed with an organisation associated with its domain, the International Council of Museums, which was able to support and then promote its development. Unfortunately, prosopographers are not so well organised!
In this situation, then, how could one get the creation of an OHP underway? With presentations like this one, I mean to explore whether there is interest in the idea, and to see who might want to be involved in building it. I am also giving a similar presentation at the annual international Digital Humanities conference in Lincoln, Nebraska in July. I have spoken informally about the idea to various prosopographical partners, and to people involved in other prosopographical projects who are aware of our work at DDH. I've also had a brief word about it with some people associated with the TEI.
If there is interest, I'd expect that it would be possible to arrange a workshop to bring people together to develop the ideas further.
So, what do you think?