G04 vassilis tzouvaras_mapping_with_mint

Creating mappings with the MINT system
Vassilis Tzouvaras, NTUA

Published in: Internet

  1. The MINT Mapping Tool; the MoRe Aggregator
     Vassilis Tzouvaras, Dimitris Gavrilis
     National Technical University of Athens
     Digital Curation Unit - IMIS, Athena Research Center
     LoCloud is funded by the European Commission's ICT Policy Support Programme
  2. Cultural Heritage Content
     • Diversity of cultural heritage content
       – Numerous metadata schemas to annotate content (LIDO, CIDOC-CRM, EAD, METS)
     • Massive digitization and annotation activities are in progress
     • Need for interoperability
  3. MINT Mapping Tool
     • Lets users map their own metadata schemas to reference domain models
     • Follows a typical web-based architecture
     • Developed for ATHENA; currently also used for EUScreen, CARARE, Judaica, ECLAP, DCA and Linked Heritage
  4. MINT 2 – What’s new?
     • The backend was reconstructed for better performance
       – File size limit for imports is extended
     • The frontend was updated
       – New interface
       – Workflow is integrated in the UI
       – Easier browsing of input and target schemas
  5. MORe Overall Architecture
     (Diagram labels: Registry; Apache Cassandra cluster; Fedora-commons; Temporary storage; Vocabulary services; Storage; JMS; logging; Messaging; Core services; Enrichment service management; Entity matching / NLP; Geocoding / Historic place names; REST; External enrichment services; Publish service management; OAI-PMH; RDF Store; Elasticsearch; Archive)
  6. Cloud architecture
     • Decentralized
     • Scalable
     • Four cloud environments
       – Storage
       – Monitoring & logging
       – Core services deployment
       – Enrichment services deployment
  7. Distributed
     • Enrichment services run in:
       – Austria
       – Spain
       – Greece
       – Lithuania
       – Slovenia
       – Norway
     • Scalability can be facilitated through a virtualization infrastructure
  8. Workflow
     (Diagram labels: OAI-PMH; LoCloud Collections; Wikimedia; MINT; Harvest; Ingest; Transform; Enrich; Publish; OAI-PMH; Archive; RDF Store; Solr; Validate; Index; Delete; Reject; Omeka)
  9. Intermediate Schemas
     (Diagram labels: Dublin Core, LIDO, CARARE, EAD, ESE, EDM, listed twice; OMEKA-XML; OGD)
  10. Core services: Harvesting, Validation, Ingestion, Transformation, Enrichment, Previewing, Publishing
      Harvesting
      • Harvests content from metadata sources: OAI-PMH repository, MINT, LoCloud Collections, Wikimedia
      • Multiple schemas are supported: OAI_DC, CARARE, CARARE 2.0, LIDO, EAD, EDM, ESE
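As an illustration of the harvesting step, here is a minimal sketch of how a harvester might build OAI-PMH `ListRecords` requests. The `verb`, `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH protocol itself; the function name, the base URL and the set name are hypothetical, and nothing here is taken from the MoRe code base.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument in OAI-PMH:
        # it is sent with the verb only, to continue a partial list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
first_page = list_records_url("https://example.org/oai", "oai_dc", set_spec="photos")
next_page = list_records_url("https://example.org/oai", resumption_token="abc")
```

A harvester would fetch the first URL, read the `resumptionToken` element from the response if one is present, and keep requesting follow-up pages until no token is returned.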
  11. Core services: Validation
      • Validates incoming information packages
      • Executes validation schemes
      • Validation micro-services: structure, schema, linking, Schematron rules
      • Flexible
      • How it is used in MoRe: pre-validation, post-validation
  12. Core services: Ingestion
      • Ingests content into storage
      • Uses the storage layer API
      • Pluggable drivers for attaching different technologies / repositories: Apache Cassandra, filesystem-based, Fedora-commons
      • Versioning support
      • Complex digital object support
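The pluggable-driver idea can be sketched as a small interface with interchangeable back ends. This is only an illustration under stated assumptions: the `StorageDriver` protocol, its `put`/`get` methods and both driver classes are hypothetical names, not the actual MoRe storage layer API, and object IDs are assumed to be simple file-safe strings.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class StorageDriver(Protocol):
    """Hypothetical storage-layer interface that every driver implements."""

    def put(self, object_id: str, data: bytes) -> None: ...
    def get(self, object_id: str) -> bytes: ...


class InMemoryDriver:
    """Keeps objects in a dict; useful for tests."""

    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def put(self, object_id: str, data: bytes) -> None:
        self._items[object_id] = data

    def get(self, object_id: str) -> bytes:
        return self._items[object_id]


class FilesystemDriver:
    """Stores each object as one file under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, object_id: str, data: bytes) -> None:
        (self._root / object_id).write_bytes(data)

    def get(self, object_id: str) -> bytes:
        return (self._root / object_id).read_bytes()


def ingest(driver: StorageDriver, object_id: str, data: bytes) -> bytes:
    """Ingest through whichever driver is plugged in, then read it back."""
    driver.put(object_id, data)
    return driver.get(object_id)


with tempfile.TemporaryDirectory() as tmp:
    stored_on_disk = ingest(FilesystemDriver(Path(tmp)), "record-1", b"<record/>")
stored_in_memory = ingest(InMemoryDriver(), "record-1", b"<record/>")
```

The calling code depends only on the interface, so a Cassandra or Fedora-commons driver could be attached without changing `ingest`.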
  13. Core services: Content Model
      • Digital objects comprise data streams
      • Each data stream can hold any kind of information (XML/RDF, image, video, documents, etc.)
      • Each different representation of an information object is stored as a different data stream
      • Each curation action (transformation, enrichment) generates a new version
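The content model described above could be represented roughly as follows. This is a sketch, not the MoRe schema: the class and field names are hypothetical, and it assumes that versions are tracked per data stream, which the slide does not specify.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int     # 1-based version counter
    action: str     # curation action that produced it, e.g. "transformation"
    content: bytes  # payload of any kind (XML/RDF, image, ...)


@dataclass
class DataStream:
    name: str
    media_type: str
    versions: list = field(default_factory=list)

    def add_version(self, action: str, content: bytes) -> Version:
        """Record a curation action as a new version; older ones are kept."""
        version = Version(len(self.versions) + 1, action, content)
        self.versions.append(version)
        return version

    @property
    def latest(self) -> Version:
        return self.versions[-1]


@dataclass
class DigitalObject:
    object_id: str
    streams: dict = field(default_factory=dict)

    def stream(self, name: str, media_type: str) -> DataStream:
        """Return the named data stream, creating it on first use."""
        return self.streams.setdefault(name, DataStream(name, media_type))


obj = DigitalObject("record-1")
edm = obj.stream("EDM", "application/rdf+xml")
edm.add_version("transformation", b"<edm:ProvidedCHO/>")
edm.add_version("enrichment", b"<edm:ProvidedCHO enriched='true'/>")
```

After the two calls, the `EDM` stream holds both versions, so the state before enrichment remains available.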
  14. Core services: Transformation
      • Transforms entire information packages into the Europeana Data Model (EDM), or any other schema
      • Multiple transformation routines: per schema, per project, per provider
      • Users can attach a rights statement
  15. Core services: Enrichment
      • The generic enrichment service facilitates the execution of the enrichment micro-services
        – Hides the complexity from the user by using enrichment plans
        – Provides seamless integration with the UI of MoRe
      • Virtual Enrichment driver
        – Allows developers/creative industries to create their own enrichment services and declare/use them within MoRe
  16. Core services: Previewing
      • Preview the XML record information for all data streams
      • Preview the record in HTML (using the Europeana style sheet)
  17. Core services: Publishing
      • Publish transformed / enriched information
        – Internal OAI-PMH provider
        – XML export
        – Publish directly to RDF repositories (Sesame, Virtuoso)
        – Solr index server
  18. Enrichment micro-services
      • Thematic
        – Thesauri collections
        – Vocabulary matching
        – Background links
      • Spatial
        – Geo-normalization
        – Geo-coding
        – Reverse geo-coding
        – Historic place names
      • Other
        – Language identification
      (Diagram labels: SKOS Thesauri; GeoNames; DBpedia; Wikipedia)
  19. Enrichment Plan
      • Enrichment micro-services are used within enrichment workflows, called enrichment plans
      • Each enrichment plan applies to a specific schema
      • Each enrichment plan executes enrichment micro-services in a specific order
      (Diagram labels: Enrichment plans; Language identification; Vocabulary matching; Geo-normalization; Geo-coding)
  20. Enrichment Plan
      • Each enrichment plan defines run-time parameters for specific services
        – Content-based, e.g. add subject collection A only if term X or Y is matched
      (Diagram labels: Enrichment plans; Language identification; Vocabulary matching; Geo-normalization; Geo-coding)
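An enrichment plan of this kind, with ordered micro-services and a content-based condition, might be modelled as in the sketch below. All names (`Step`, `EnrichmentPlan`, the toy services and the vocabulary terms) are hypothetical placeholders, and records are plain dictionaries for brevity; this is not how MoRe represents plans internally.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # a metadata record, simplified to a dict for this sketch


@dataclass
class Step:
    name: str
    service: Callable[[Record], Record]
    # Content-based run-time condition; by default the step always runs.
    condition: Callable[[Record], bool] = lambda record: True


@dataclass
class EnrichmentPlan:
    schema: str  # the schema this plan applies to
    steps: list  # micro-services, executed in list order

    def run(self, record: Record):
        applied = []
        for step in self.steps:
            if step.condition(record):
                record = step.service(record)
                applied.append(step.name)
        return record, applied


def match_vocabulary(record: Record) -> Record:
    """Toy vocabulary matching against a fixed term set."""
    vocabulary = {"X", "Y", "Z"}
    matched = sorted(vocabulary & set(record.get("subject", [])))
    return {**record, "matched_terms": matched}


def add_collection_a(record: Record) -> Record:
    return {**record, "collections": record.get("collections", []) + ["A"]}


def matched_x_or_y(record: Record) -> bool:
    return bool({"X", "Y"} & set(record.get("matched_terms", [])))


plan = EnrichmentPlan("EDM", [
    Step("vocabulary-matching", match_vocabulary),
    Step("add-collection-A", add_collection_a, matched_x_or_y),
])

with_x, steps_with_x = plan.run({"subject": ["X", "Q"]})
only_z, steps_only_z = plan.run({"subject": ["Z"]})
```

In the first run term X is matched, so both steps execute and collection A is added; in the second run only Z is matched, so the conditional step is skipped.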
  21. Dashboard
  22. Packages organization
  23. Package overview
  24. Package lifecycle overview
  25. Preview
  26. Metadata completeness & statistics
  27. Enrichment services overview
  28. Direct access to 27 thesauri; create & (re)use subject collections
  29. Thank you
      tzouvaras@image.ntua.gr
      d.gavrilis@dcu.gr
