Slides of the presentation of the paper Reflections on Cultural Heritage and Digital Humanities by Arianna Ciula and Øyvind Eide in DATeCH 2014. #digidays
DODDLE-OWL: A Domain Ontology Construction Tool with OWL (Takeshi Morita)
In this paper, we propose a domain ontology construction tool with OWL. The advantage of our tool is its focus on the quality refinement phase of ontology construction. Through interactive support for refining the initial ontology, an OWL-Lite level ontology, which consists of taxonomic relationships (defined as classes) and non-taxonomic relationships (defined as properties), is constructed effectively. The tool also provides semi-automatic generation of the initial ontology from domain-specific documents and general ontologies.
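For readers unfamiliar with the OWL-Lite distinction mentioned above, here is a minimal sketch, not taken from DODDLE-OWL itself, of how a taxonomic relationship can be expressed as a class hierarchy and a non-taxonomic relationship as an object property, using the Python rdflib library; all class and property names are illustrative.

from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/onto#")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# Taxonomic relationship: classes linked by rdfs:subClassOf
g.add((EX.Disease, RDF.type, OWL.Class))
g.add((EX.Influenza, RDF.type, OWL.Class))
g.add((EX.Influenza, RDFS.subClassOf, EX.Disease))

# Non-taxonomic relationship: an object property between two classes
g.add((EX.Drug, RDF.type, OWL.Class))
g.add((EX.treats, RDF.type, OWL.ObjectProperty))
g.add((EX.treats, RDFS.domain, EX.Drug))
g.add((EX.treats, RDFS.range, EX.Disease))

print(g.serialize(format="turtle"))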
Slides of the presentation of the paper An approach to Unsupervised Historical Text Normalisation by Petar Mitankin, Stefan Gerdjikov and Stoyan Mihov in DATeCH 2014. #digidays
i2S DigiBook is an innovative group with 35 years of experience, specialized in book scanning and digitization solutions for libraries. Their flagship product, LIMB, is an all-in-one solution that allows users to process scanned documents, apply image enhancements and quality control, perform OCR, generate and export metadata and digital outputs like PDFs and images, and publish content directly to a digital library. LIMB aims to create higher quality digital assets from textual cultural collections more quickly and at a lower cost than traditional digitization methods by automating tasks and workflows.
This document summarizes an experiment on automatically assigning topics to text from a historical encyclopedia using optical character recognition (OCR).
The researchers tested automated topic assignment on 14 pages from an 18th century German encyclopedia that had been OCR'd. They analyzed the recall and precision of topic assignment on the OCR'd text, original text, and original text with modernized spelling. Topic assignment was challenging due to OCR errors and historical topics not represented in their topic hierarchy.
While automated topic assignment showed some value in organizing the historical texts, errors limited its usefulness where high precision was needed. The researchers identified ways to improve precision, such as updating the topic hierarchy, and proposed combining automated assignment with social tagging.
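As a side note on the evaluation mentioned above, precision and recall for topic assignment can be computed per page by comparing assigned topics against a gold-standard list; the sketch below is purely illustrative and uses invented topic labels rather than the study's data.

def precision_recall(assigned, gold):
    """Precision and recall of assigned topics against a gold-standard set."""
    assigned, gold = set(assigned), set(gold)
    true_positives = len(assigned & gold)
    precision = true_positives / len(assigned) if assigned else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical example: topics assigned to one encyclopedia page
assigned = ["medicine", "botany", "alchemy"]
gold = ["medicine", "botany", "pharmacy"]
print(precision_recall(assigned, gold))  # roughly (0.667, 0.667)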
The document discusses research on representing computer-supported collaborative learning (CSCL) scripts using the IMS Learning Design specification. Key points:
- Researchers studied how to express CSCL macro-scripts and mechanisms using IMS Learning Design to make implementations more customizable and interoperable.
- Collaborative learning flow patterns from literature were coded in IMS Learning Design elements and attributes to demonstrate capabilities and limitations.
- An authoring tool called Collage was developed to allow designing processes based on learning design patterns and templates.
- Later work focused on embedding assessment activities, ensuring interoperability with IMS Question and Test Interoperability, and developing shared learning design environments and communities.
The document discusses using an "open design" approach to make better use of open educational resources (OERs) and technologies in learning design. It involves representing learning designs visually using tools like CompendiumLD to make the designs more explicit and shareable. Pedagogical patterns are also proposed as a way to structure designs and distill best practices. The approach was explored in workshops and aimed to help educators more effectively design pedagogically informed learning activities that leverage OERs and technologies.
Presented as part of the course on 'Information Visualization' (533C) taught by Professor Tamara Munzner at UBC in 2009.
http://people.cs.ubc.ca/~tmm/courses/533-09/
More info @ http://www.cs.ubc.ca/~mohanr/
Ontology-based Semantic Approach for Learning Object Recommendation (IDES Editor)
The main focus of this paper is to apply an ontology-based approach for semantic learning object recommendation in personalized e-learning systems. Ontologies for the learner model, learning objects, and semantic mapping rules are proposed. The recommender can provide individualized learning objects by taking into account learner preferences and styles, which are used to adjust or fine-tune the recommendation process. In the proposed framework, we demonstrate how the ontologies enable machines to interpret and process learning resources in the recommendation system. The recommendation consists of four steps: semantic mapping between the learner and learning objects, preference score calculation, learning object ranking, and recommending the learning object. As a result, a personalized and most suitable learning object is recommended to the learner.
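To make the four-step pipeline above concrete, here is a minimal, illustrative sketch of preference scoring and ranking over candidate learning objects; the attribute names and weights are assumptions, not taken from the paper.

from dataclasses import dataclass

@dataclass
class LearningObject:
    title: str
    topic: str
    media_type: str      # e.g. "video", "text", "quiz"
    difficulty: str      # e.g. "beginner", "advanced"

# Step 1: semantic mapping is approximated here by matching the learner's
# topic of interest; a real system would resolve both sides against ontologies.
learner = {"topic": "ontologies", "preferred_media": "video", "level": "beginner"}

candidates = [
    LearningObject("Intro to OWL", "ontologies", "video", "beginner"),
    LearningObject("Advanced SPARQL", "query languages", "text", "advanced"),
    LearningObject("Ontology basics (slides)", "ontologies", "text", "beginner"),
]

def preference_score(lo: LearningObject) -> float:
    # Step 2: preference score as a weighted sum of matched features
    score = 0.0
    score += 0.5 if lo.topic == learner["topic"] else 0.0
    score += 0.3 if lo.media_type == learner["preferred_media"] else 0.0
    score += 0.2 if lo.difficulty == learner["level"] else 0.0
    return score

# Steps 3 and 4: rank candidates and recommend the top one
ranked = sorted(candidates, key=preference_score, reverse=True)
print("Recommended:", ranked[0].title)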
The document discusses domain modeling for personalized learning. It defines a domain model as representing domain knowledge through concepts and their relationships. Domain models serve as the basis for individual student models and for indexing and classifying learning content. They can be used to model student knowledge and decide on appropriate next steps for learning. The document describes different types of domain models, including vector, network, conceptual, and procedural models. It also discusses using ontologies and different aspects in domain modeling and applying domain models to student modeling, content indexing, and personalized guidance.
Experimenting with eXtreme Design (EKAW2010), by evabl444
The document reports on an experiment evaluating the use of Content Ontology Design Patterns (ODPs) and the eXtreme Design (XD) methodology and tools. The experiment confirmed previous findings that Content ODPs improve ontology quality and reduce common mistakes. It also found that the XD tools support reuse of ODPs and that the XD methodology further decreases mistakes through its test-driven approach. Areas for future work include improving collaboration support and evaluating the methodology on other tasks.
Structuration of Personal Learning Environments (Marco Kalz)
Lecture given in the context of the MUPPLE lecture series at The Open University. If you want to download the slides please see here http://hdl.handle.net/1820/3433.
The document discusses ontologies, including their definition, purpose, and typical engineering process. It provides examples of existing ontologies such as DBpedia, Wikidata, and WordNet. It also outlines some key activities for developing ontologies, such as finding relevant existing ontologies, selecting which to use or extend, and adjusting or expanding them as needed. Some basics of ontology conceptualization are also introduced, such as modeling classes, instances, attributes, and relationships between classes.
Topic modeling is a technique for discovering hidden semantic patterns in large document collections. It represents documents as probability distributions over latent topics, where each topic is characterized by a distribution over words. Two common probabilistic topic models are latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (pLSA). LDA assumes each document exhibits multiple topics in different proportions, with topics modeled as distributions over words. Topic modeling provides dimensionality reduction and can be applied to problems like text classification, collaborative filtering, and computer vision tasks like image classification.
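As an illustration of the workflow described above (not code taken from the document), a tiny LDA example with the gensim library might look like this; the toy corpus is invented.

from gensim import corpora, models

# Toy corpus: each document is a list of tokens (real pipelines would
# tokenize, remove stop words, etc.)
texts = [
    ["ontology", "class", "property", "owl"],
    ["topic", "model", "word", "distribution"],
    ["ontology", "owl", "reasoner"],
    ["topic", "word", "document", "distribution"],
]

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Fit a 2-topic LDA model; each topic is a distribution over words,
# each document a distribution over topics.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=20, random_state=0)

for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)

# Topic proportions for the first document
print(lda.get_document_topics(corpus[0]))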
The document discusses the LINHD Digital Humanities Innovation Lab at UNED and its goals of promoting innovation, information, consultancy, technological services, and training. It aims to pioneer the use of linked data and semantic web technologies through projects like DIREPO, which links different digital poetic repertoires using a common ontology modeled in OWL. This will improve visibility of researchers' work, efficiency, and opportunities for collaboration by making the data interoperable and accessible through a SPARQL endpoint.
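A SPARQL endpoint such as the one mentioned above can be queried programmatically; the following is a generic, hedged sketch using the SPARQLWrapper library, in which the endpoint URL and the query vocabulary are placeholders rather than DIREPO's actual schema.

from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint; substitute the real SPARQL endpoint URL.
sparql = SPARQLWrapper("http://example.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?repertoire ?label WHERE {
        ?repertoire rdfs:label ?label .
    } LIMIT 10
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["repertoire"]["value"], "-", binding["label"]["value"])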
How to model digital objects within the semantic web (Angelica Lo Duca)
These slides describe the general concepts of the Semantic Web and Linked Data, then illustrate the concept of a digital object, and finally present a use case.
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks (Leonardo Di Donato)
Experimental work on the use of Topic Modeling to implement and improve some common Information Retrieval and Word Sense Disambiguation tasks.
It first describes the scenario, the pre-processing pipeline built, and the framework used, then discusses an investigation of different hyperparameter configurations for the LDA algorithm.
The work then deals with the retrieval of relevant documents, mainly through two approaches: inferring the topic distribution of a held-out document (or query) and comparing it against the collection to retrieve similar documents, or an approach driven by probabilistic querying. The last part of the work is devoted to the word sense disambiguation task.
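To illustrate the first retrieval approach mentioned above (comparing inferred topic distributions), here is a small, self-contained sketch; the distributions are made up for the example rather than inferred from a real LDA model.

import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical topic distributions (each sums to 1) for a query and three documents
query = [0.70, 0.20, 0.10]
docs = {
    "doc_a": [0.65, 0.25, 0.10],
    "doc_b": [0.10, 0.10, 0.80],
    "doc_c": [0.40, 0.40, 0.20],
}

# Rank documents by similarity of their topic distribution to the query's
ranking = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
for name, dist in ranking:
    print(name, round(cosine(query, dist), 3))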
Supporting the Interpretation of Enriched Audiovisual Sources through Tempora... (TimelessFuture)
This document discusses a project that developed a timeline prototype to help scholars explore enriched audiovisual content metadata, like automatic speech transcripts, in a temporal manner. An evaluation with 5 media studies scholars found the prototype facilitated exploratory searching but transparency about data limitations was important. Next steps involve integrating prototype elements into a digital research environment to support audiovisual analysis.
This document discusses improving the interpretability of RASA NLU models through machine learning techniques. It introduces interpretable machine learning and how tools like ScatterText and LIME can be used to analyze RASA NLU training data and models. These techniques help identify confusing intents, common words between intents, and explain model predictions. The goal is to troubleshoot models and refine training data to improve natural language understanding.
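As an illustration of the kind of explanation technique mentioned above, the following is a generic LIME sketch for a text classifier, with a scikit-learn pipeline standing in for an NLU intent model; the intents and training sentences are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Toy intent classifier standing in for an NLU model
train_texts = [
    "book a table for two", "reserve a table tonight",
    "what is the weather today", "will it rain tomorrow",
]
train_intents = ["book_table", "book_table", "ask_weather", "ask_weather"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_intents)

# Explain one prediction: which words pushed it toward each intent
explainer = LimeTextExplainer(class_names=list(clf.classes_))
explanation = explainer.explain_instance(
    "can I book a table for tomorrow", clf.predict_proba, num_features=5
)
print(explanation.as_list())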
This presentation is about a lecture I gave within the "Software systems and services" immigration course at the Gran Sasso Science Institute, L'Aquila (Italy): http://cs.gssi.infn.it/.
http://www.ivanomalavolta.com
This document provides an overview of the ontology development process including the following key steps: requirements definition, term extraction, ontology conceptualization, initial and detailed model drafting, ontology implementation, non-ontological resource transformation, and ontology evaluation. It discusses considerations for each step such as tools, focus, and best practices.
Diagrammatic knowledge modeling for managers – ontology-based approach (Dmitry Kudryavtsev)
Diagrams are an effective and popular tool for visual knowledge structuring, and managers often use them to acquire and transfer business knowledge. There are many diagrams and visual modeling languages currently available for managerial needs; unfortunately, the choice between them is frequently error-prone and inconsistent. This situation raises the following questions. Which diagrams or visual modeling languages are the most suitable for a specific type of business content? Which domain-specific diagrams are the most suitable for visualizing particular elements of an organizational ontology? To provide the answers, the paper suggests a lightweight specification of diagrams and knowledge content types, based on competency questions and ontology design patterns. The proposed approach provides a classification of qualitative business diagrams.
Kudryavtsev, D. V., Gavrilova, T. A. (2011). Diagrammatic knowledge modeling for managers – ontology-based approach. Accepted poster. International Conference on Knowledge engineering and Ontology Development, 26-29 October, 2011, Paris, France. P. 386-389.
Deep Learning for Information Retrieval: Models, Progress, & Opportunities (Matthew Lease)
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
This document discusses knowledge patterns, which are invariances or regularities that exist across different types of data and domains. It provides examples of knowledge patterns found in linguistic resources, data, interactions, and semantic resources. It also discusses using knowledge patterns as expertise units and how patterns can be represented at different levels of abstraction through morphisms. Finally, it discusses some examples of problems involving temporal and procedural patterns as well as anti-patterns to avoid in knowledge modeling.
This document provides an overview of ontologies and the semantic web. It defines ontologies as formal specifications of conceptualizations that are shared between people and computers. Ontologies provide a common vocabulary and conceptual structure to facilitate understanding between humans and machines. They allow different systems and communities to work together by providing shared definitions of concepts and relationships. The development of ontologies and the semantic web aims to make web resources more computer-readable and enable machines to better understand and process online information.
Prieto et al., 2010 - Recurrent Routines in the Classroom Madness (lprisan)
Presentation of the paper at the "Current challenges in learning design and pedagogical patterns research" symposium in the NLC 2010 conference in Aalborg, Denmark
Linking Research and Education in Digital Libraries: students' perspectives (Getaneh Alemu)
This presentation was given by Getaneh Alemu at the TPDL-2011 workshop on "Linking Research and Education in Digital Libraries", held 28-29 September 2011 in Berlin. Getaneh was invited by the workshop organisers (Vittore Casarosa, Donatella Castelli and Anna Maria Tammaro) to present his perspectives and experiences in digital library education and research. For more information about the workshop see http://www.dlib.org/dlib/november11/casarosa/11casarosa.html
Slides of the paper Deep Learning-Based Morphological Taggers and Lemmatizers for Annotating Historical Texts by Helmut Schmid at the 3rd Edition of the DATeCH2019 International Conference
This document discusses using text models to improve the accuracy of optical character recognition (OCR) on Chinese rare books. It reports experiments using n-gram, backward/forward n-gram, and LSTM models on OCR data from ancient medicine books. The backward and forward 4-gram model achieved the highest correction rate at 97.57%. Mixing the LSTM 6-gram model with the OCR's top 5 candidates and the probability of the top candidate further improved accuracy to 97.71%, demonstrating that combining text models with OCR probabilities corrects OCR errors better than text models alone. In conclusion, text models are effective for increasing OCR accuracy on rare books, with the backward/forward 4-gram and LSTM 6-gram models performing best.
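As a rough illustration of the kind of combination described above (not the authors' actual implementation), a language-model score can be interpolated with the OCR engine's candidate probabilities to rerank the top-5 candidates at each position; all probabilities below are invented.

import math

# Hypothetical top-5 OCR candidates for one position, with OCR confidences
ocr_candidates = {"药": 0.52, "葯": 0.21, "茍": 0.12, "若": 0.09, "苦": 0.06}

# Hypothetical language-model probabilities for each candidate given its context
lm_probs = {"药": 0.40, "葯": 0.05, "茍": 0.01, "若": 0.30, "苦": 0.24}

def combined_score(cand, weight=0.6):
    """Log-linear interpolation of the LM score and the OCR confidence."""
    return weight * math.log(lm_probs[cand]) + (1 - weight) * math.log(ocr_candidates[cand])

best = max(ocr_candidates, key=combined_score)
print("Chosen candidate:", best)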
Slides of the paper Turning Digitised Material into a Diachronic Corpus: Metadata Challenges in the Nederlab Project by Katrien Depuydt and Hennie Brugman at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Standoff Annotation for the Ancient Greek and Latin Dependency Treebank by Giuseppe Celano at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Using lexicography to characterise relations between species mentions in the biodiversity literature by Sandra Young at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Implementation of a Databaseless Web REST API for the Unstructured Texts of Migne's Patrologia Graeca with Searching capabilities and additional Semantic and Syntactic expandability by Evagelos Varthis, Marios Poulos, Ilias Yarenis and Sozon Papavlasopoulos at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Curation Technologies for a Cultural Heritage Archive: Analysing and transforming a heterogeneous data set into an interactive curation workbench by Georg Rehm, Martin Lee, Julián Moreno Schneider and Peter Bourgonje at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Cross-disciplinary collaborations to enrich access to non-Western language material in the Cultural Heritage sector by Tom Derrick and Nora McGregor at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Tribunal Archives as Digital Research Facility (TRIADO): new ways to make archives accessible and useable by Anne Gorter, Edwin Klijn, Rutger Van Koert, Marielle Scherer and Ismee Tames at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Improving OCR of historical newspapers and journals published in Finland by Senka Drobac, Pekka Kauppinen and Krister Lindén at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Towards a generic unsupervised method for transcription of encoded manuscripts by Arnau Baró, Jialuo Chen, Alicia Fornés and Beáta Megyesi at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Towards the Extraction of Statistical Information from Digitised Numerical Tables - The Medical Officer of Health Reports Scoping Study by Christian Clausner, Apostolos Antonacopoulos, Christy Henshaw and Justin Hayes at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Detecting Articles in a Digitized Finnish Historical Newspaper Collection 1771–1929: Early Results Using the PIVAJ Software by Kimmo Kettunen, Teemu Ruokolainen, Erno Liukkonen, Pierrick Tranouez, Daniel Antelme and Thierry Paquet at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper OCR-D: An end-to-end open-source OCR framework for historical documents by Clemens Neudecker, Konstantin Baierer, Maria Federbusch, Kay-Michael Würzner, Matthias Boenig, Elisa Hermann and Volker Hartmann at the 3rd Edition of the DATeCH2019 International Conference
- The document describes a project to fill gaps in knowledge about diamond mining, trading, and polishing in Borneo by developing a workflow using various CLARIAH tools and resources.
- The workflow involved digitizing a diamond encyclopedia, extracting concepts and place names, linking the data to external sources to create linked open data, and querying newspaper archives to build a corpus of relevant articles.
- Promising results showed mining, trading, and polishing continued in Borneo for Southeast Asian customers, and described previously unknown diamond fields and polishing locations in Borneo. The project aims to apply the workflow to other commodities like sugar.
Slides of the paper Automatic Reconstruction of Emperor Itineraries from the Regesta Imperii by Juri Opitz, Leo Born, Vivi Nastase and Yannick Pultar at the 3rd Edition of the DATeCH2019 International Conference
Slides of the paper Automatic Semantic Text Tagging on Historical Lexica by Combining OCR and Typography Classification by Christian Reul, Sebastian Göttel, Uwe Springmann, Christoph Wick, Kay-Michael Würzner and Frank Puppe at the 3rd Edition of the DATeCH2019 International Conference
This document describes the SOS system for segmenting, stemming, and standardizing Arabic text. It presents the challenges of processing Arabic cultural heritage texts which contain orthographic variations. The system uses gradient boosting machines and achieves state-of-the-art performance on segmentation and derives stemming as a byproduct. It also standardizes orthography with high accuracy, which further improves segmentation. The system addresses issues like hamza forms and letter confusions that previous systems did not handle well.
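To give a flavour of what boundary classification with gradient boosting can look like, here is a generic sketch that is not the SOS system's actual feature set or data: each character is represented by a small window of surrounding characters and labelled according to whether a segment boundary follows it.

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction import DictVectorizer

# Tiny invented training set: (word, per-character labels), where 1 means
# "a segment boundary follows this character".
examples = [("والكتاب", [1, 0, 1, 0, 0, 0, 0]),   # w + al + ktab (illustrative)
            ("وكتب",   [1, 0, 0, 0])]             # w + ktb (illustrative)

def char_features(word, i):
    return {
        "char": word[i],
        "prev": word[i - 1] if i > 0 else "<s>",
        "next": word[i + 1] if i < len(word) - 1 else "</s>",
        "pos": i,
    }

X_dicts, y = [], []
for word, labels in examples:
    for i, label in enumerate(labels):
        X_dicts.append(char_features(word, i))
        y.append(label)

vec = DictVectorizer()
X = vec.fit_transform(X_dicts)

clf = GradientBoostingClassifier(random_state=0)
clf.fit(X.toarray(), y)

# Predict boundaries for a new (invented) word
test = "فكتب"
X_test = vec.transform([char_features(test, i) for i in range(len(test))])
print(clf.predict(X_test.toarray()))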
Datech2014 Session 2 - Reflections on Cultural Heritage and Digital Humanities
1. Reflections on Cultural Heritage and Digital Humanities: Modelling in Practice and Theory
Dr Arianna Ciula, University of Roehampton, UK, arianna.ciula@roehampton.ac.uk
Dr Øyvind Eide, Universität Passau, Germany, oyvind.eide@uni-passau.de
2. Scope and Aims
• Compare modelling traditions in Cultural Heritage and Digital Humanities
• Our paper today → investigation into some modelling practices
• Longer term: comparing the communities
• What is meant by modelling and models?
• How are modelling languages and theories created and used?
3. Background on Modelling
● Ambiguity of the term 'data model' in digital modelling – from database models to conceptual models
● Process (dynamic nature and epistemic value) vs. products (data models) – modelling vs. model
● Models of vs. models for
● Theoretical background
6. Modelling in DH (textual) → TEI
• Textual features
• No assumption on reference function
• Overview
• From 1987 Research Project, first release 1990, from 2001 TEI Consortium
• One part ISO standard
• XML formalism
• Organisation
• Community
• Modelling as document analysis
• reflects semantics of the standard and contingent theories/practices
7. Modelling in cultural heritage (museum documentation) → CIDOC CRM
• Real world objects as represented in museum information systems
• Overview
• CIDOC established 1950: museum documentation standards
• From 1996: Conceptual Reference Model, first release 1999
• ISO standard
• Openness with respect to formalism
• Organisation
• Community
• Ontology or conceptual model
• Modelling as mapping
• reflects semantics of the standard and contingent theories/practices
8. Pragmatic links between the two standards
● TEI SIG ontologies
● To facilitate mapping and integration
● Established in 2004
● Focus on links between TEI and external ontologies
● Previous comparisons between TEI and CIDOC-CRM at class level
● Projects to account for and process textual mobility
9. Henry III Fine Rolls Project
• TEI XML
• Physical and logical structure
• Semantic content
• RDF/OWL ontology
• Network of associations
• Additional statements and interpretative layers
<rs key="abjuration" type="subject">on the day he abjured the kingdom<persName key="rumberue_de_thomas">Thomas de <placeName key="rumberue">Rumberue</placeName></persName></rs>
<persName key="ashford_de_william">William de <placeName key="ashford1">Ashford</placeName></persName>
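Purely as an illustration of the "additional statements" idea on this slide (not the project's actual ontology), the keys from the TEI markup above could be turned into RDF triples with rdflib, using an invented vocabulary:

from rdflib import Graph, Namespace, Literal, RDF, RDFS

FR = Namespace("http://example.org/finerolls/")  # invented namespace
g = Graph()
g.bind("fr", FR)

# Entities identified by the @key values in the TEI markup
g.add((FR.rumberue_de_thomas, RDF.type, FR.Person))
g.add((FR.rumberue, RDF.type, FR.Place))
g.add((FR.abjuration, RDF.type, FR.Subject))
g.add((FR.rumberue, RDFS.label, Literal("Rumberue")))

# Interpretative statements layered on top of the encoded text
g.add((FR.rumberue_de_thomas, FR.participatesIn, FR.abjuration))
g.add((FR.rumberue_de_thomas, FR.associatedWith, FR.rumberue))

print(g.serialize(format="turtle"))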
11. Models for and models of
● Main purpose of these standards
● Models for (users)
● Less evident to users
● Models of (creators - but affects use)
● Both perspectives are needed in order to understand differences between the standards
● how they are presented
● how they are formalised
● how they can be used
13. Sahle (2012), pluralistic model of text
TEXT_S: text as idea, intention, meaning, semantics, sense, content
TEXT_L: text as linguistic code, as series of words, as speech
TEXT_D: text as document: physical, material, individual
TEXT_V: text as a visual object, as a complex sign
TEXT_G: text as a version of ..., as a set of graphs, graphemes, glyphs, characters, etc. (... having modes ...)
TEXT_W: text as a work, as rhetoric structure
15. Pragmatic links - Place name in TEI
• Name as reference vs. name as source for onomastic studies, linguistic analysis, etymology etc.
• Semantic aspects (comparable with CIDOC-CRM)
Madrid
<p>A conference in <placeName>Madrid</placeName>.</p>
<nym>
  <form>Madrid</form>
</nym>
<place>
  <placeName>Madrid</placeName>
</place>
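To connect this slide with the CIDOC-CRM view that follows, here is a hedged rdflib sketch of how the same place name might be modelled as an E53 Place identified by an E41 Appellation; the URIs are placeholders, and while the class and property identifiers follow the published CRM labels (E53_Place, E41_Appellation, P1_is_identified_by), the exact encoding used by a given project may differ.

from rdflib import Graph, Namespace, Literal, RDF, RDFS

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
EX = Namespace("http://example.org/places/")  # placeholder project namespace

g = Graph()
g.bind("crm", CRM)
g.bind("ex", EX)

# Madrid as an E53 Place identified by an E41 Appellation
g.add((EX.madrid, RDF.type, CRM.E53_Place))
g.add((EX.madrid_name, RDF.type, CRM.E41_Appellation))
g.add((EX.madrid_name, RDFS.label, Literal("Madrid")))
g.add((EX.madrid, CRM.P1_is_identified_by, EX.madrid_name))

print(g.serialize(format="turtle"))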
16. CIDOC-CRM
[Diagram of core CIDOC-CRM classes: E39 Actors (persons, institutions) participate in E2 Temporal Entities (Events); events affect or refer to E18 Physical Things and E28 Conceptual Objects; E41 Appellations refer to / identify entities and E55 Types refer to / refine them; events occur at E52 Time-Spans and have location within E53 Places.]
19. Place names in TEI and CIDOC-CRM
TEI:
● Usually located in the context of other words and marked up "on location"
● Can also be data driven
● Hierarchy of content objects
● Links crossing hierarchy: from tree
CIDOC-CRM:
● Located in the context of an information system
● Class hierarchy with multiple inheritance
● Object graph
20. TEI and CIDOC-CRM compared
Modelling scope: TEI is expansive; CIDOC-CRM is focused.
Modelling components: TEI places descriptive and interpretative encoding at the same level; CIDOC-CRM divides the model as a set of statements about reality from the interpretative argument.
Modelling discourse: TEI has a loose and flexible structure, mostly structured by natural language; CIDOC-CRM is a formal ontology with strict (but multiple) inheritance and multiple instantiation.
Presentation: TEI offers scope notes for each element, narrative texts describing use as processes, and examples; CIDOC-CRM offers scope notes, short examples, and graphical presentation of class and object hierarchies.
Playing different games
21. Thank you!
Dr Arianna Ciula, University of Roehampton, UK, arianna.ciula@roehampton.ac.uk
Dr Øyvind Eide, Universität Passau, Germany, oyvind.eide@uni-passau.de