Presentation of the case study on the Roman du Mont Saint-Michel (Guillaume de Saint-Pair), given as part of the COST-IRHT Training School "La transmission des textes : nouveaux outils, nouvelles approches" (Paris), by Stefanie Gehrke.
This document discusses MongoDB aggregation operations. It provides examples of using aggregation stages like $group, $match, $sort, $limit, $project and $unwind to count, group, filter, and transform data from the restaurants collection. Specifically, it shows pipelines to count the number of documents by cuisine type sorted descending, filter by borough before grouping, unwind an array to count element occurrences, and calculate the number of "A" grades for each restaurant. The document explains how aggregation allows building a multi-stage data processing pipeline to transform and analyze MongoDB data without using SQL.
MongoDB offers two native data processing tools: MapReduce and the Aggregation Framework. MongoDB's built-in aggregation framework is a powerful tool for performing analytics and statistical analysis in real time and for generating pre-aggregated reports for dashboarding. In this session, we will demonstrate how to use the aggregation framework for different types of data processing, including ad-hoc queries, pre-aggregated reports, and more. At the end of this talk, you should walk away with a greater understanding of the built-in data processing options in MongoDB and how to use the aggregation framework in your next project.
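Since the summary above only names the pipeline stages, here is a minimal pure-Python sketch of what those pipelines compute, run over a three-document toy collection. The field names (cuisine, borough, grades) mirror MongoDB's well-known restaurants sample dataset; the commented pipelines show the MongoDB equivalents.

```python
from collections import Counter

# Toy stand-in for the restaurants collection used in the talk.
restaurants = [
    {"name": "Trattoria", "borough": "Brooklyn", "cuisine": "Italian",
     "grades": [{"grade": "A"}, {"grade": "A"}]},
    {"name": "Le Coq", "borough": "Manhattan", "cuisine": "French",
     "grades": [{"grade": "B"}]},
    {"name": "Nonna", "borough": "Brooklyn", "cuisine": "Italian",
     "grades": [{"grade": "A"}]},
]

# Mirrors: db.restaurants.aggregate([
#   {"$group": {"_id": "$cuisine", "count": {"$sum": 1}}},
#   {"$sort":  {"count": -1}}])
counts_by_cuisine = Counter(r["cuisine"] for r in restaurants).most_common()

# Mirrors a $match on borough placed before the $group stage.
brooklyn = Counter(r["cuisine"] for r in restaurants
                   if r["borough"] == "Brooklyn").most_common()

# Mirrors $unwind on "grades" followed by $match/$group counting "A" grades.
a_grades = {r["name"]: sum(1 for g in r["grades"] if g["grade"] == "A")
            for r in restaurants}

print(counts_by_cuisine)  # [('Italian', 2), ('French', 1)]
print(a_grades)           # {'Trattoria': 2, 'Le Coq': 0, 'Nonna': 1}
```

Each Python expression corresponds to one multi-stage pipeline; in MongoDB the work happens server-side, here it is only simulated in memory.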
Biblissima's prototype on Medieval Manuscripts Illuminations and their Context (Equipex Biblissima)
Presentation given at the SW4SH Workshop, organised as part of ESWC 2015 (Portorož, Slovenia, 1 June 2015), by Stefanie Gehrke (Biblissima metadata coordinator), Eduard Frunzeanu, Pauline Charbonnier and Marie Muffat.
Talk by Stefanie Gehrke at the workshop "TEI and Neighbouring Standards" at the DiXiT Convention Week 2015 (Huygens ING, The Hague, 15 September 2015).
Data integration with a façade. The case of knowledge graph construction. (Enrico Daga)
"Data integration with a façade. The case of knowledge graph construction." is an overview of recent research in façade-based data access. The slides introduce the core notions of façade-based data access and the design principles of SPARQL Anything, a system that allows querying many formats (CSV, JSON, XML, HTML, Markdown, Excel, ...) in plain SPARQL.
Elasticsearch And Apache Lucene For Apache Spark And MLlib (Jen Aman)
This document summarizes a presentation about using Elasticsearch and Lucene for text processing and machine learning pipelines in Apache Spark. Some key points:
- Elasticsearch provides text analysis capabilities through Lucene and can be used to clean, tokenize, and vectorize text for machine learning tasks.
- Elasticsearch integrates natively with Spark through Java/Scala APIs and allows indexing and querying data from Spark.
- A typical machine learning pipeline for text classification in Spark involves tokenization, feature extraction (e.g. hashing), and a classifier like logistic regression.
- The presentation proposes preparing text analysis specifications in Elasticsearch once and reusing them across multiple Spark pipelines to simplify the workflows and avoid data movement between systems.
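The feature-extraction step mentioned above (hashing) can be sketched in a few lines of plain Python. The tokenizer here is a naive stand-in for a real Elasticsearch/Lucene analyzer, and the 16-slot dimension is arbitrary:

```python
import zlib

def tokenize(text):
    # Crude lowercase/whitespace analysis; in the talk this step is
    # delegated to an Elasticsearch/Lucene analyzer instead.
    return text.lower().split()

def hashing_features(tokens, dim=16):
    # The "hashing trick": map each token to a fixed-size vector slot.
    # crc32 is used because Python's built-in hash() is salted per process.
    vec = [0] * dim
    for t in tokens:
        vec[zlib.crc32(t.encode("utf-8")) % dim] += 1
    return vec

vec = hashing_features(tokenize("Spark and Lucene and Elasticsearch"))
print(sum(vec))  # 5 tokens hashed into the vector
```

The resulting fixed-length vector is what a downstream classifier such as logistic regression would consume.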
Learning Content Patterns from Linked Data (Emir Muñoz)
Linked Data (LD) datasets (e.g., DBpedia, Freebase) are used in many knowledge extraction tasks due to the high variety of domains they cover. Unfortunately, many of these datasets do not provide a description of their properties and classes, reducing the users' freedom to understand, reuse or enrich them. This work attempts to fill part of this gap by presenting an unsupervised approach to discover syntactic patterns in the properties used in LD datasets. The approach produces a content patterns database generated from the textual data (content) of properties, which describes the syntactic structures that each property has. Our analysis enables (i) a human understanding of syntactic patterns for properties in an LD dataset, and (ii) a structural description of properties that facilitates their reuse or extension. Results over the DBpedia dataset also show that our approach enables (iii) the detection of data inconsistencies, and (iv) the validation and suggestion of new values for a property. We also outline how the resulting database can be exploited in several information extraction use cases.
2015.05.19 tom de nies - tin can2prov exposing interoperable provenance of ... (tdenies)
This document describes a mapping between the Experience API (xAPI) format for logging learning experiences and the W3C PROV standard for representing provenance. It outlines how xAPI statements can be converted to JSON-LD and then mapped to equivalent PROV concepts to make the learning logs interoperable. The mapping was tested on real xAPI statement data and allows logging of learning processes in a way that is machine-interpretable and can be queried and analyzed at scale. Going forward, the mapping will be used and tested in educational projects and systems to start leveraging the power of linked data for learning analytics.
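As a rough illustration of the mapping idea (not the actual TinCan2PROV rules), a simplified xAPI statement can be turned into PROV-flavoured records like this. All structure beyond the xAPI actor/verb/object triad is illustrative:

```python
# Hypothetical, much-simplified version of the xAPI -> PROV idea:
# the statement's actor becomes a prov:Agent, the experience itself a
# prov:Activity, and the learning object a prov:Entity. The real
# TinCan2PROV mapping is considerably richer.
def xapi_to_prov(statement):
    return {
        "agent": {"@type": "prov:Agent", "name": statement["actor"]["name"]},
        "activity": {
            "@type": "prov:Activity",
            "prov:wasAssociatedWith": statement["actor"]["name"],
            "label": statement["verb"]["display"],
        },
        "entity": {"@type": "prov:Entity", "@id": statement["object"]["id"]},
    }

stmt = {
    "actor": {"name": "Ada"},
    "verb": {"display": "completed"},
    "object": {"id": "http://example.org/course/42"},
}
prov = xapi_to_prov(stmt)
```

Once the learning log is in PROV terms, it can be serialised as JSON-LD and queried alongside other linked data, which is the interoperability point the talk makes.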
Mapping Subsets of Scholarly Information (Paul Houle)
The document discusses using machine learning techniques like support vector machines (SVMs) to analyze and classify academic literature from a large online corpus like arXiv. It finds that SVMs can accurately identify documents belonging to large categories with over 10,000 documents but struggle with smaller categories of under 500 documents. To improve recall for small categories, the SVM outputs are converted to probabilities using a sigmoid function rather than relying on signed distances from the hyperplane alone.
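The sigmoid conversion described above is essentially Platt scaling. A minimal sketch, with the a and b parameters (normally fitted on held-out data) set to illustrative values:

```python
import math

def platt_probability(margin, a=-1.0, b=0.0):
    # Platt scaling: squash a signed SVM decision value into (0, 1).
    # a and b are usually fitted by maximum likelihood on a validation
    # set; the defaults here are illustrative only.
    return 1.0 / (1.0 + math.exp(a * margin + b))

# Larger (more positive) margins map to higher probabilities.
probs = [platt_probability(m) for m in (-2.0, 0.0, 2.0)]
```

Working with calibrated probabilities rather than raw distances lets one lower the decision threshold for rare categories, which is how recall on small classes is recovered.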
Scala and big data in ICM. Scoobie, Scalding, Spark, Stratosphere. Scalar 2014 (Michał Oniszczuk)
This document discusses various big data frameworks including Spark, Scoobi, Hadoop, and GraphX. It provides an example of using Spark to interactively analyze log data stored on HDFS. Spark allows loading data into memory and running multiple queries efficiently. The document also discusses benefits of Spark such as its scalability, libraries for SQL, streaming, machine learning and more.
Collective entity linking with WSRM DocEng'19 (ngamou)
The document discusses using knowledge base semantics in context-aware entity linking. It proposes a collective entity linking approach that identifies all entities mentioned in a text at once by leveraging an RDF knowledge base. It introduces a weighted semantic relatedness measure to compute the collective coherence score between candidate entities. Experimental results show the approach outperforms state-of-the-art methods on several benchmark datasets.
Knowledge graph construction with a façade - The SPARQL Anything Project (Enrico Daga)
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X" which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This facade focuses on the RDF meta-model and aims to apply minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
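The facade idea can be caricatured in plain Python: expose any nested structure as generic containers whose dict keys become properties and whose list positions become numbered slots, in the spirit of RDF containers (rdf:_1, rdf:_2, ...). This is only a sketch of the intuition, not the actual Facade-X model or the SPARQL Anything implementation:

```python
def to_triples(node, subject="_:root"):
    # Flatten a dict/list structure into (subject, predicate, object)
    # triples with a generic, format-agnostic shape: dict keys act as
    # properties, list positions become rdf:_n container slots.
    triples = []
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = ((f"rdf:_{i + 1}", v) for i, v in enumerate(node))
    else:
        return triples
    for key, value in items:
        if isinstance(value, (dict, list)):
            child = f"{subject}/{key}"          # synthetic blank-node id
            triples.append((subject, key, child))
            triples.extend(to_triples(value, child))
        else:
            triples.append((subject, key, value))
    return triples

# Could equally have come from a CSV row or a JSON document.
doc = {"name": "row1", "tags": ["csv", "json"]}
triples = to_triples(doc)
```

The point of the minimal-commitment design is that the same generic structure works for any input format, so a single SPARQL query language can address all of them.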
bridging formal semantics and social semantics on the web (Fabien Gandon)
The document summarizes research on bridging formal semantics and social semantics on the web. It discusses:
1) The Wimmics research team which studies web-instrumented machine interactions, communities, and semantics using a multidisciplinary approach and typed graphs.
2) The challenge of analyzing, modeling, and formalizing social semantic web applications for communities by combining formal semantics and social semantics.
3) Examples of past work that have structured folksonomies, combined metric spaces for tags, and analyzed sociograms and social networks.
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp... (MongoDB)
This document discusses using machine learning and various machine learning platforms like MongoDB, Spark, Watson, Azure, and AWS to engage customers. It provides examples of using these platforms for tasks like topic detection on tweets, sentiment analysis, recommendation engines, forecasting, and marketing response prediction. It also discusses architectures, languages, and functions supported by tools like Mahout, MLlib, and Watson Developer Cloud.
Multimedia Data Navigation and the Semantic Web (SemTech 2006), by Bradley Allen
The document describes a system for faceted navigation of multimedia content using semantic web technologies. It discusses using ontologies expressed in RDF(S) and OWL to represent metadata, BBC rush footage used as a case study, and visual facets for color, texture and combinations that were generated through MPEG-7 feature extraction and self-organizing map clustering. The system allows retrieval of clips and shots based on textual and visual facet filtering of the RDF represented multimedia data.
XML London 2013 presentation
The paper addresses the topic of frameworks intended to speed up the development of web applications using the XML stack (XQuery, XSLT and native XML databases). These frameworks must offer the ability to produce exploitable XML content by web users without technical skills, and must be simple enough to lower the barrier to entry for developers. This is particularly true for a low-budget class of applications that we call Small Data applications. This article presents Oppidum, a lightweight open source framework for building web applications relying on a RESTful approach, sustained by intuitive authoring facilities to populate an XML database. This is illustrated with a simple application created for editing this article on the web.
The document summarizes a two-year project to develop an ontology and data model to represent scholarly works and their usage. It will analyze bibliographic data and usage data from sources like journals, papers, and online usage logs to develop metrics to quantify the scholarly community. The first year will focus on developing the ontology and algorithms while the second year will analyze the results and report findings.
- Biblissima is a project that aims to provide a single access point to over 40 databases and 3 image repositories related to medieval and Renaissance manuscripts.
- It uses semantic web technologies to integrate metadata and link resources. Sc:Manifests in the IIIF framework are used to represent manuscripts and their components like images, transcriptions and annotations.
- TEI files can be transformed into IIIF Sc:Manifests to support displaying transcriptions from the TEI files in a viewer. This allows linking manuscript components and metadata while retaining the richness of encoding in the source files.
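A TEI-to-IIIF transformation of the kind described ultimately emits a manifest document. A deliberately minimal Presentation API 2.x skeleton (all URLs and labels here are placeholders, and real manifests carry many more fields, including the image annotations on each canvas) looks like this:

```python
import json

def make_manifest(manifest_id, label, canvas_labels):
    # Build a bare-bones IIIF Presentation 2.x manifest: one sequence
    # containing one canvas per page/folio label supplied.
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": manifest_id,
        "@type": "sc:Manifest",
        "label": label,
        "sequences": [{
            "@type": "sc:Sequence",
            "canvases": [
                {"@id": f"{manifest_id}/canvas/{i}",
                 "@type": "sc:Canvas",
                 "label": page_label}
                for i, page_label in enumerate(canvas_labels, start=1)
            ],
        }],
    }

m = make_manifest("http://example.org/manifest", "Demo manuscript",
                  ["f. 1r", "f. 1v"])
manifest_json = json.dumps(m, indent=2)
```

In the Biblissima workflow, each TEI page or folio division would map to one such canvas, with the transcription attached as an annotation list so a IIIF viewer can display it alongside the image.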
Maria Patterson - Building a community fountain around your data stream (PyData)
1. The document discusses building a community around astronomical data streams using Apache Kafka for transport, Apache Avro for data formatting, and Apache Spark for filtering.
2. It provides examples of using these tools to prototype an alert stream, including mock data and filtering demonstrations.
3. The tools allow open access to large astronomical data streams in a scalable and flexible way for diverse creative uses.
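A toy version of the prototype described above, with plain Python generators standing in for both the Kafka transport and the Spark filtering (the magnitude field, threshold, and seed are illustrative):

```python
import random

def mock_alerts(n, seed=7):
    # Fake astronomical alert stream; in the prototype this would be
    # Avro-encoded records consumed from a Kafka topic.
    rng = random.Random(seed)
    for i in range(n):
        yield {"alert_id": i, "magnitude": round(rng.uniform(14.0, 22.0), 2)}

def bright_only(stream, threshold=17.0):
    # Consumer-side filter; lower magnitude means a brighter object.
    # In the prototype this filtering step runs in Spark.
    return (alert for alert in stream if alert["magnitude"] < threshold)

bright = list(bright_only(mock_alerts(100)))
```

The generator pipeline mirrors the streaming architecture: producers emit alerts, and each downstream consumer applies its own filters without touching the shared stream.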
In this talk, we present two emerging, popular open source projects: Spark and Shark. Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. It can outperform Hadoop by up to 100x in many real-world applications. Spark programs are often much shorter than their MapReduce counterparts thanks to its high-level APIs and language integration in Java, Scala, and Python. Shark is an analytic query engine built on top of Spark that is compatible with Hive. It can run Hive queries much faster in existing Hive warehouses without modifications.
These systems have been adopted by many organizations large and small (e.g. Yahoo, Intel, Adobe, Alibaba, Tencent) to implement data intensive applications such as ETL, interactive SQL, and machine learning.
Integrating Heterogeneous Data Sources in the Web of Data (Franck Michel)
These are the slides of a 40-minute presentation I gave at the CNRS Software Development days (JDEV 2017) in Marseille, France, on July 5th, 2017.
Here is the Webcast, in French: https://webcast.in2p3.fr/videos-integrer_des_sources_de_donnees_heterogenes_dans_le_web_de_donnees
The document proposes a graphical visualization of knowledge articles from sources like Wikipedia to provide an interactive and connected graph representation. It discusses algorithms to extract keywords, calculate weights, and map keywords to links between articles. A mathematical model and feasibility analysis are presented, along with examples of a proposed user interface and publications/conferences where the paper was submitted. The goal is to provide an easily understandable overview of article topics to reduce reading time.
ACCU 2011 Introduction to Scala: An Object Functional Programming Language (Peter Pilgrim)
Peter Pilgrim
Oracle Java Champion
On Wednesday 13 April 2011, at 11am in the Cherwell Room, he gave a 90-minute presentation on the subject
Introduction into Scala: The Object-Functional Programming Language
The presentation outlined the technical feel of the Scala 2.8 programming language.
“More and more programmers are going to need to understand functional programming at a productive level. Especially over the next 5 to 10 years” - IBM, Jason Stidd, Functional Programming Blog, 2010
The presentation covers:
Classes and Companion Objects
Function Objects
Case classes and Pattern matching
Traits, Mix-ins and Compositions
Scala Collections
Library Frameworks
“Scala enjoys a singular advantage: performance. The language is compiled to optimised byte codes and runs as fast as native Java” Infoworld, 2010
Learn More:
http://xenonique.co.uk/blog/
peter.pilgrim@gmail.com
http://scala-lang.org/
twitter:peter_pilgrim
The document discusses the relationship between database systems and information retrieval (IR) systems. It notes that while they have traditionally operated in "parallel universes", addressing them together is now important for applications. It outlines some key differences and similarities between the two areas and discusses efforts to build more integrated database and IR platforms.
Da Biblissima a Biblissima+ : per un osservatorio delle culture scritte (Equipex Biblissima)
Presentation by Anne-Marie Turcan-Verkerk at the LIBER Sixth Summer School on Trends in Manuscript Studies, University of Cassino (13 September 2021).
eScriptorium: An Open Source Platform for Historical Document Analysis (Equipex Biblissima)
By Daniel Stoekl Ben Ezra (Directeur d'études, EPHE-PSL, UMR 8546 AOrOc).
Rendez-vous IIIF360, an online event on IIIF standards and technologies, organised by the IIIF360 consortium (Biblissima, Campus Condorcet, Huma-Num) on 24 March 2021: https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021
Similar to Roman du Mont Saint-Michel: Biblissima's case study with the University of Caen and the British Library
Annotate (E-ReColNat) : annotation rapide d'images et de vidéos en sciences n... (Equipex Biblissima)
By Gilles Bertin (Research Engineer, CNAM).
Appliquer les techniques d'apprentissage profond pour détecter les enluminure... (Equipex Biblissima)
By Victoria Eyharabide (Associate Professor, STIH Laboratory, Sorbonne Université), Fouad Aouinti (Postdoctoral Researcher, STIH Laboratory, Sorbonne Université) and Xavier Fresquet (Deputy Director, Sorbonne Center for Artificial Intelligence - SCAI, Sorbonne Université).
Représentations du chant du Moyen Âge dans les images IIIF (Equipex Biblissima)
By Valérie Le Page (PhD candidate, IReMus Laboratory, Sorbonne Université) and Victoria Eyharabide (Associate Professor, STIH Laboratory, Sorbonne Université).
Réflexions et explorations croisées autour de IIIF, Omeka-s et NumaHOP à la B...Equipex Biblissima
Par Pauline Rivière (Chef de projet numérisation à la Bibliothèque Sainte-Geneviève).
Rendez-vous IIIF360, un événément en ligne autour des standards et technologies IIIF organisé par le consortium IIIF360 (Biblissima, Campus Condorcet, Huma-Num) le 24 mars 2021 : https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021
Mise en œuvre de IIIF pour la reconnaissance automatique de documentsEquipex Biblissima
Par Christopher Kermorvant (Président de TEKLIA).
Rendez-vous IIIF360, un événément en ligne autour des standards et technologies IIIF organisé par le consortium IIIF360 (Biblissima, Campus Condorcet, Huma-Num) le 24 mars 2021 : https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021
Par Adrien Desseigne (Ingénieur d'études, concepteur et développeur d'applications web, TGIR Huma-Num).
Rendez-vous IIIF360, un événément en ligne autour des standards et technologies IIIF organisé par le consortium IIIF360 (Biblissima, Campus Condorcet, Huma-Num) le 24 mars 2021 : https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021
Par Régis Robineau (Ingénieur d'études, coordinateur de l'équipe Biblissima, membre du Technical Review Committee de IIIF).
Rendez-vous IIIF360, un événément en ligne autour des standards et technologies IIIF organisé par le consortium IIIF360 (Biblissima, Campus Condorcet, Huma-Num) le 24 mars 2021 : https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021
Digital Manuscripts Without Borders: A Discovery Platform of Manuscripts and ...Equipex Biblissima
This document summarizes a presentation about Biblissima IIIF-Collections, a discovery platform for digital manuscripts and rare books. It aggregates metadata from 10 digital libraries containing around 65,000 IIIF manifests. The platform allows for cross-collection searching and filtering by library, language, and date. It also links to additional information on authors, manuscripts, and entities in the Biblissima authority database. The presentation outlines the workflow used to harvest, process, and ingest metadata into Elasticsearch for search. It also discusses feedback provided to data providers and future plans to expand the authority database.
Présentation par Régis Robineau dans le cadre de l'atelier Campus Condorcet “Référentiels géo-historiques sémantisés pour les humanités” (Ecole nationale des chartes, 14 mai 2019)
Les référentiels Biblissima : épine dorsale du portail Biblissima et de IIIF-...Equipex Biblissima
Présentation lors du séminaire sur l'étude des provenances dans les bibliothèques territoriales françaises (CERL, MCC, Bibliothèque municipale de Lyon), le 3 avril 2019 à Lyon. Par Régis Robineau (Biblissima - Campus Condorcet, EPHE-PSL).
Introduction aux protocoles IIIF. Formation Enssib 23.01.2019 (Régis Robineau)Equipex Biblissima
Présentation des protocoles IIIF dans le cadre de la formation au Diplôme de conservateur de bibliothèque de l'Enssib (DCB 27), à Villeurbanne le 23 janvier 2019. Par Régis Robineau (Biblissima - Campus Condorcet, EPHE-PSL).
Biblissima is a data facility that aims to federate digital libraries, structure research data and communities, train researchers, and facilitate access to and reuse of textual and documentary resources. It has over 50 partner projects involving libraries, archives, and universities in France, the UK, Canada, and the US. Biblissima develops tools like Collatinus and Eulexis for analyzing Latin and Greek texts. It also organizes summer courses for cultivating young researchers. The Biblissima portal aggregates data from over 10 sources to visualize manuscripts and books, with features for searching, browsing, and comparing resources using IIIF standards.
A la recherche du patrimoine écrit avec le portail BiblissimaEquipex Biblissima
Présentation par Régis Robineau lors de la journée d'étude Médiadix et URFIST de Paris "Revisiter le patrimoine en bibliothèque : valorisation, médiatisation et démocratisation" (Pôle métiers du livre, Université Paris Nanterre - 14 décembre 2018)
Browse and Visualize Manuscripts Illuminations with IIIFEquipex Biblissima
This document discusses using IIIF to browse and visualize illuminations from medieval manuscripts in the Biblissima Portal. It summarizes the Biblissima Portal's focus on medieval manuscript collections and its aggregation of metadata. It then describes two databases of manuscript illuminations and metadata that have been made available through IIIF, including over 200,000 illuminations from the BnF. Finally, it discusses potential improvements to better integrate supplemental metadata and provide a customized UI for exploring illuminations.
Les descripteurs des bases iconographiques Mandragore (BnF) et Initiale (IRHT...Equipex Biblissima
Présentation par Eduard Frunzeanu et Régis Robineau lors du workshop Zoomathia “Zoological an zoology-related Databases” (Muséum national d'histoire naturelle, Paris - 23 novembre 2018)
Roman du Mont Saint-Michel: Biblissima's case study with the University of Caen and the British Library
1. Roman du Mont Saint Michel
A medieval verse narrative in three books.
A Shared Canvas case study by the University of Caen, the British Library and Biblissima
2. Roman du Mont Saint Michel
Source:
Two manuscripts
- BL Additional 10289 (date 1275-1300)
- BL Additional 26876 (date 1340)
Stefanie Gehrke - Pool Biblissima. Training School COST-IRHT (02.04.2014)
3. Roman du Mont Saint Michel
Editions
Print edition:
Guillaume de Saint-Pair, Le roman du Mont Saint-Michel, Les manuscrits du Mont Saint-Michel : textes fondateurs II, C. Bougy (éd.), 2009.
Access point electronic edition:
http://www.unicaen.fr/puc/sources/gsp/index.php?page=sommaire
Online catalogue BL:
http://www.bl.uk/manuscripts/FullDisplay.aspx?ref=Add_MS_10289
http://www.bl.uk/catalogues/illuminatedmanuscripts/record.asp?MSID=19373&CollID=27&NStart=26876
4. Roman du Mont Saint Michel
Electronic edition in TEI P5 compiling the two textual witnesses ("A" and "B")
Translation into modern French in TEI P5
No use of <surface> and <zone> (edition created in 2006);
digitisation carried out by the BL at the end of 2013
5. Electronic Editions in TEI-P5
- Chapter 11 of the TEI Guidelines:
Representation of Primary Sources
- 11.1 Digital Facsimile
- 11.2.2 Embedded Transcription
=> use of <sourceDoc> in the case where such images
are complemented by a documentary transcription
6. Electronic Editions in TEI
facsimile: contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.
sourceDoc: contains a transcription or other representation of a single source document, potentially forming part of a dossier génétique or collection of sources.
surface: defines a written surface as a two-dimensional coordinate space, optionally grouping one or more graphic representations of that space, zones of interest within that space, and transcriptions of the writing within them.
surfaceGrp: defines any kind of useful grouping of written surfaces, for example the recto and verso of a single leaf, which the encoder wishes to treat as a single unit.
zone: defines any two-dimensional area within a surface element.
7. Technical details MIRADOR / TEI
MIRADOR displays TEI transcriptions line by line, transformed to JSON-LD.
JSON-LD (= an RDF serialization)
8. Technical details MIRADOR / TEI
http://demos.biblissima-condorcet.fr/mirador/
1. Go to “BL Add MS 10289”
2. Click on the info icon (i)
3. http://sanddragon.bl.uk/IIIFMetadataService/add_ms_10289.json
10. TEI 2 JSON-LD for transcriptions
For each <pb> (= for each canvas) we create an AnnotationList.
That list contains several resources, here one transcription per line.
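For orientation, the general shape of such an AnnotationList can be sketched as follows. This is a hedged sketch in the pre-2.0 Shared Canvas / Open Annotation style; the list's @id and the @context URL are our assumptions, while the canvas URL pattern, the coordinates and the verse text come from examples elsewhere in this deck:

```json
{
  "@context": "http://www.shared-canvas.org/ns/context.json",
  "@id": "http://sanddragon.bl.uk/IIIFMetadataService/list/folio-8v.json",
  "@type": "sc:AnnotationList",
  "resources": [
    {
      "@type": "oa:Annotation",
      "motivation": "sc:painting",
      "resource": {
        "@type": "cnt:ContentAsText",
        "format": "text/plain",
        "chars": "De la forest a feit areine"
      },
      "on": "http://sanddragon.bl.uk/IIIFMetadataService/canvas/folio-8v.json#xywh=2023,540,3340,196"
    }
  ]
}
```

Each transcribed line becomes one annotation whose "on" target pins it to a rectangular region of the canvas via the #xywh fragment.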
12. TEI-P5 <surface> and <zone>
TEI-P5:
@ulx, @uly, @lrx, @lry
(upper left x, upper left y, lower right x, lower right y)
[Diagram: the surface as a coordinate space running from (0,0) at the top left to (6049,8552) at the bottom right; a zone is given by its upper-left corner (ulx, uly) and its lower-right corner (lrx, lry)]
13. TEI-P5 versus Shared Canvas
Shared Canvas: x, y, w, h
(x, y, width, height)
[Diagram: the same (0,0) to (6049,8552) coordinate space; a zone is given by its upper-left corner (x, y) together with its width w and height h]
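The conversion between the two conventions is simple arithmetic; a minimal sketch (the helper name is ours, the coordinates are those of the vers449 zone shown later in the deck):

```python
def tei_to_xywh(ulx, uly, lrx, lry):
    """Convert TEI corner coordinates (@ulx, @uly, @lrx, @lry)
    to Shared Canvas x, y, w, h."""
    return (ulx, uly, lrx - ulx, lry - uly)

# The zone of verse 449 from the TEI example in this deck:
print(tei_to_xywh(2023, 540, 5363, 736))  # (2023, 540, 3340, 196)
```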
14. Adding #xywh to resources (JSON-LD)
Approximate, semi-automatic approach to surfaces and zones for transcripts
+ easy to use
- not exact; works only for very regular manuscripts
15. Adding #xywh to resources (JSON-LD)
Use any image manipulation program, e.g. GIMP, to determine the average location of the text on the images.
The image manipulation program usually displays the cursor's position below the image.
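The positions read off in this way can then be averaged into a single assumed text origin; a minimal sketch with made-up sample values (not real measurements):

```python
# Upper-left corner of the text block, read off in GIMP on a few
# sample recto pages (illustrative values, not real measurements).
samples = [(2010, 530), (2035, 548), (2024, 542)]

# Average the measurements to get one assumed text origin
# used for all recto pages.
avg_x = sum(x for x, _ in samples) // len(samples)
avg_y = sum(y for _, y in samples) // len(samples)

print(avg_x, avg_y)  # 2023 540
```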
16. Adding #xywh to resources (JSON-LD)
Differentiate between recto and verso pages
…
<xsl:variable name="lx"><xsl:choose>
<xsl:when test="contains($pg,'r')">
<xsl:value-of select="$r_x"/></xsl:when>
<xsl:otherwise><xsl:value-of select="$v_x"/></xsl:otherwise>
</xsl:choose>
</xsl:variable>
…
Compute the average height of a line
line_height = text_height / number_of_lines
Line n starts at
y = text_y + (n - 1) * line_height
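The two formulas above can be sketched as follows (all numbers are illustrative, chosen by us):

```python
# Illustrative assumptions: the text block starts at y = 540
# and is 8000 px high, with 40 lines per page.
text_y, text_height, number_of_lines = 540, 8000, 40

# line_height = text_height / number_of_lines
line_height = text_height / number_of_lines  # 200.0

# Line n starts at y = text_y + (n - 1) * line_height
def line_start_y(n):
    return text_y + (n - 1) * line_height

print(line_start_y(1), line_start_y(3))  # 540.0 940.0
```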
→ <xsl:text>"on":"http://sanddragon.bl.uk/IIIFMetadataService/canvas/folio-</xsl:text><xsl:value-of select="$pg"/>
<xsl:text>.json#xywh=</xsl:text><xsl:value-of select="$lx"/><xsl:text>,</xsl:text><xsl:value-of select="$y"/>
<xsl:text>,</xsl:text><xsl:value-of select="$text_width"/><xsl:text>,</xsl:text><xsl:value-of select="$line_height"/>
<xsl:text>"</xsl:text>
17. Adding <zones> and <surfaces> in TEI
Parallel approach to the transcript:
<facsimile>
referencing via @xml:id and #facs
+ high level of detail possible
+ more encoding possibilities for the text
+ text and image data can be separated
- the complexity makes transformation and display harder
18. Technical details TEI
...
… <facsimile>
...
<surface ulx="0" uly="0" lrx="6049" lry="8552">
<zone xml:id="f8v"
ulx="0" uly="0" lrx="6049" lry="8552">
<graphic url="http://sanddragon.bl.uk/IIIFImageService/add_ms_10289_f008v/full/full/0/native.jpg"/>
</zone>
<zone xml:id="vers449"
ulx="2023" uly="540" lrx="5363" lry="736"/>
<zone xml:id="vers450"
ulx="2023" uly="737" lrx="5363" lry="944"/>
…
</facsimile> ...
...<div><div><lg> …
<l n="449" aid:pstyle="txt_Original_Vers" xml:id="vers449" facs="#vers449">
<pb ed="A" n="8v" xml:id="f8v" facs="#f8v"/>De la forest a feit areine</l>
<l n="450" aid:pstyle="txt_Original_Vers" xml:id="vers450" facs="#vers450">
Entor le mont, et bele et pleine<note type="marginal" xml:id="AFRftn207">areigne.</note>.</l> …
</lg></div></div>...
19. Technical details MIRADOR / TEI
XSL code “ulx, uly, lrx, lry” (TEI - facsimile)
to “x, y, w, h” (Shared Canvas)
...
<xsl:for-each-group select="/TEI/text/body//lg/l" group-starting-with="/TEI/text/body//lg/l[pb]">
…
<xsl:for-each select="current-group()">
...
<xsl:variable name="id"><xsl:value-of select="substring-after(@facs,'#')"/></xsl:variable>
<!-- width = lower right x - upper left x -->
<xsl:variable name="width"><xsl:value-of select="/TEI/facsimile/surface/zone[@xml:id=$id]/@lrx -
/TEI/facsimile/surface/zone[@xml:id=$id]/@ulx"/></xsl:variable>
<!-- height = lower right y - upper left y -->
<xsl:variable name="height"><xsl:value-of select="/TEI/facsimile/surface/zone[@xml:id=$id]/@lry -
/TEI/facsimile/surface/zone[@xml:id=$id]/@uly"/></xsl:variable>
<!-- x and y of the media fragment are the zone's upper-left corner -->
<xsl:text>"on":"http://sanddragon.bl.uk/IIIFMetadataService/canvas/folio-</xsl:text><xsl:value-of select="$pg"/>
<xsl:text>.json#xywh=</xsl:text><xsl:value-of select="/TEI/facsimile/surface/zone[@xml:id=$id]/@ulx"/><xsl:text>,</xsl:text>
<xsl:value-of select="/TEI/facsimile/surface/zone[@xml:id=$id]/@uly"/><xsl:text>,</xsl:text>
<xsl:value-of select="$width"/><xsl:text>,</xsl:text><xsl:value-of select="$height"/><xsl:text>"</xsl:text>
… </xsl:for-each>
…</xsl:for-each-group> ...
20. Technical details MIRADOR / TEI
Embedded approach to the transcript:
<sourceDoc>
+ direct approach
+ clarity due to the limitation to transcripts only
+ simpler for XSLT transforms
- mixing of text and image data
21. Technical details TEI
… <sourceDoc>
...
<surface ulx="0" uly="0" lrx="6049" lry="8552">
<zone xml:id="f8v" ulx="0" uly="0" lrx="6049" lry="8552">
<graphic url="http://sanddragon.bl.uk/IIIFImageService/add_ms_10289_f008v/full/full/0/native.jpg"/>
</zone>
<zone ulx="2023" uly="540" lrx="5363" lry="736">
<line n="449" aid:pstyle="txt_Original_Vers" xml:id="vers449">
<pb ed="A" n="8v" xml:id="f8v" facs="#f8v"/>De la forest a feit areine</line>
</zone>
<zone ulx="2023" uly="737" lrx="5363" lry="944">
<line n="450" aid:pstyle="txt_Original_Vers" xml:id="vers450">Entor le mont,
et bele et pleine<note type="marginal" xml:id="AFRftn207">areigne.</note>.</line>
</zone>
…
</sourceDoc> ...
...
22. Technical details MIRADOR / TEI
XSL code “ulx, uly, lrx, lry” (TEI - sourceDoc)
to “x, y, w, h” (Shared Canvas)
...
<xsl:for-each select="/TEI/sourceDoc//surface">
…
<xsl:for-each select=".//line">
...
<xsl:variable name="id"><xsl:value-of select="@xml:id"/></xsl:variable>
<!-- the parent of each line is the zone carrying the coordinates -->
<!-- width = lower right x - upper left x -->
<xsl:variable name="width"><xsl:value-of select="../@lrx - ../@ulx"/></xsl:variable>
<!-- height = lower right y - upper left y -->
<xsl:variable name="height"><xsl:value-of select="../@lry - ../@uly"/></xsl:variable>
<xsl:text>"on":"http://sanddragon.bl.uk/IIIFMetadataService/canvas/folio-</xsl:text><xsl:value-of select="$pg"/>
<xsl:text>.json#xywh=</xsl:text><xsl:value-of select="../@ulx"/><xsl:text>,</xsl:text><xsl:value-of select="../@uly"/>
<xsl:text>,</xsl:text><xsl:value-of select="$width"/><xsl:text>,</xsl:text><xsl:value-of select="$height"/><xsl:text>"</xsl:text>
…
</xsl:for-each>
…
</xsl:for-each>...