2011 ELLAK, EKT conference presentation,link at http://helios-eie.ekt.gr/EIE/handle/10442/8702

  • 699 views
Uploaded on

Permanent link @ http://helios-eie.ekt.gr/EIE/handle/10442/8702

Permanent link @ http://helios-eie.ekt.gr/EIE/handle/10442/8702

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
  • Animations have been supressed pls use for reference http://hanlde.net/10442/8702
    Are you sure you want to
    Your message goes here
    Be the first to like this
No Downloads

Views

Total Views
699
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
1
Comments
1
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Building an open source based, open standards, infrastructure for the large scale provisioning of reusable open content Π. Σταθόπουλος , Ν. Χούσσος, Γ. Σταύρου , Ε. Σαχίνη, Ι.Ο. Σταθοπούλου, Κ. Σταμάτης, Α. Σουμπλής email: pstath; [email_address] Συνέδριο ΕΛ/ΛΑΚ 2011, Αθήνα Εθνικό Κέντρο Τεκμηρίωσης/Ε.Ι.Ε. National Documentation Center /N.H.R.F.
  • 2. EKT: The National Documentation Centre
  • 3. “Ανοιχτά Πρότυπα & Ανοιχτό Περιεχόμενο. Τα θεμέλια της ανάπτυξης του Internet” “ Building an open source based , open standards , infrastructure for the large scale provisioning of reusable open conten t”
    • “ reusable ” and “ open ” content:
      • functionality ?
      • 4. standards ?
      • 5. software ?
      • 6. emerging requirements ?
    • An open source software stack for delivery of
      • reusable, open content
      • 7. in large scale, using open standards
      • 8. and addressing emerging requirements
  • 9. Content types and characteristics (I)
    • Different content types but similar functional requirements, examples:
      • Research output , e.g. papers , reports, patents, documentation, conferences, books, data: didaktorika.gr , helios-eie.ekt.gr , etc.
      • 10. Cultural - Humanities repositories : from research papers to commented cultural artificats: pandektis.ekt.gr
      • 11. Educational – reference material : studies, books, multimedia courses, project outcomes, e.g. repository.edulll.gr
      • 12. Digitised and/or born digital e-books : ebooks.serrelib.gr ( pre-release )
      • 13. Library automation systems and OPACs: linking the digital with the physical realm, abekt.gr
      • 14. Digital objects in general
  • 15. Content types and characteristics (II)
    • Common characteristics:
        • Online databases providing access to digital to objects (e.g., books, articles, images, mutlimedia) accompanied with rich metadata
        • 16. Collection, dissemination, preservation of material that has archival value for future reference
    • Different requirements and emphasis in features implemented in comparison to Web Content Management System s
    • 17. Fast evolving landscape. Full range of features required:
      • copyright information management, metadata-bibliographic data management , metadata organisation and interoperability mechanisms ,highly automated backoffice digitisation processes support,digital content interoperability and reuse frameworks, persisted identifiers , end user interfaces and intuitive content interaction , highly efficient storage and computing resources management
  • 18.
    • Standards (… so far)
    • Open and reusable :
      • Digital content, file formats, encoders, etc.
      • 19. Metadata description:
        • UNIMARC , MARC21 to DC and MODS
      • Interoperability protocols and interfaces:
        • OAI-PMH , OAI-ORE , Z39.50, SRU/SRW, Europeana ESE , etc.
      • Unique and persistent identifiers
        • Independent from repository software,
        • 20. HANDLE.NET (RFC3652), openURL
  • 21. An open content open source software stack No easy task to tackle the full range of these issues, in a scalable and viable manner
    • Custom built - problem specific solutions pay a high toll for standards compliance, locked in a specific solution/architecture
    Solution : best of breed open source systems for the assembly of a fully open source open content delivery software stack
    • Not trivial task - or just an integration/deployment issue
    • 22. High quality FLOSS tools are available: pluggable and modular stack, do one thing but do it at “world class”
    FLOSS/Open Standards is the enabler for scalable, viable, large scale open content infrastructures . Just like the Internet no long term alternative to open standards/FLOSS seems to exist!
  • 23. … viewers and enhanced user experiences
  • 24. … novel reading devices
  • 25. … mashups and linked data
  • 26. The open software stack components best of breed: open source, setting the standards pace, world class features, scalable and robust communities
  • 27. Case studies
    • Examples of “what can open source do for you”
    • 28. 3 case studies:
      • The Education and Life Long Learning Repository
        • Persistent Ids, rich user experience/custom tag bilingual tag clouds based on documentation metadata
      • The e-OPAC “alpha” version of openABEKT
        • Connecting the digital with the physical realms
      • Public Library e-books repository pilot: the Serres Public Library pilot
        • Backwards interoperability, repositories, enhanced viewers and image servers infrastructure
  • 29. repository.edulll.gr
  • 30.
    • http://ebooks.serrelib.gr
  • 31. http://hdl.handle.net/10812/1929
  • 32. ABEKT e-OPAC
  • 33. Enhanced image viewer/server infrastructure Data webnode(s) pool datanode(s) pool Shared Storage: 1 Master (RW) + N “slaves” (RO) Currently: 300K pages, 3M imminent from didaktorika.gr, 10 resolutions/page EKT's virtualised infrastructure Shared Storage Backend Processing Frontend Load Balancer Backend Processing Internet Requests
  • 34. Characteristics • Caching • Load Balancing • Error compensation through VCL • Backend Health Check Varnish • Lightweight & Simple • Stable • Right for the job NginX • On the fly Image Processing • No need for several Images with different resolutions • Open and Extensible API Djatoka Stateless “ Infinite” Scalability Highly Available End to End FLOSS Layer Security Apache Tomcat/DSpace, NGINX and Varnish cache engine in high availability configuration, PostgreSQL DB, on CentOS VMs
  • 35. (open)ABEKT e-OPAC
    • ABEKT new e-OPAC
      • Different need: manage the library (digital and physical) using bibliographic tools and a librarian approach
      • 36. The interoperability layer between the physical and the digital library
      • 37. “ Richer” metadata and information
    • ROR over OSS components
  • 38. Digital repositories software
    • Dspace 2.0: with Fedora inside
      • EKT participates at the design team
    • Dspace + Fedora = Duraspace
    • 39. Advanced content provisioning and preservation functionalities
    • 40. The core element for an interoperable repositories infrastructure
    • 41. Preparing for operation as SaaS and over IaaS Cloud
      • Delivers as SaaS a complex service
  • 42. Future work and informational links
    • Current and forthcoming work
      • Viewer with full text search: OCR XML
      • 43. Back port features to existing repositories
      • 44. Integrate to OJS the page by page functionality
      • 45. openABEKT
      • 46. Automate already existing economies of scale: offer SaaS capabilities for the whole range of tools
    • Links
      • http://www.duraspace.org/
      • 47. http://openlibrary.org/dev/docs/bookreader
      • 48. http://www.indexdata.com/zebra
  • 49.
      Thank you for your attention!