DSpace Under the Hood

As presented at OR10 in July 2010.
Part 1: How does DSpace work?: Whilst you don't need to be a mechanic to drive a car, it is helpful if you have a basic understanding of how a car works, what bits do different jobs, and how to top up your oil and pump up your tyres / tires. This presentation will give an overview of the DSpace architecture, and will give you enough knowledge to understand how DSpace works. By knowing this, you will also learn about ways DSpace could be used, and ways in which it can't be used.
Part 2: The development process and YOUR role in it:
DSpace development is undertaken by the DSpace community. No single person or organisation is in charge, and without contributions from the DSpace community the platform would not continue to develop and evolve. Sometimes it can appear that there are people in charge, or that unless you are a technical developer there is no way or need to contribute. This presentation will explain how DSpace development usually takes place, who has input at the different stages, and will equip you to contribute further, or to contribute for the first time.

Presenters - members of the DSpace Committers and DSpace Global Outreach Committee:
Lewis, Stuart ; Hayes, Leonie ; Stangeland, Elin ; Shepherd, Kim ; Jones, Richard ; Roos, Monica

  1. 1. DSpace Under the Hood: How DSpace works
  2. 2. This presentation• Purpose • You don’t need to be a mechanic to drive a car… …but it helps if you know how to top-up your oil, check your tyres and explain problems to your mechanic• Two presentations • How DSpace Works • The DSpace Community Development Process (and YOUR role in it)• The presenters • 3 repository managers / DGOC members • 3 developers / committers • 29 years of combined DSpace experience ES
  3. 3. Contact details• Leonie Hayes – Research Repository Manager, The University of Auckland Library – l.hayes@auckland.ac.nz (1.5)• Richard Jones – Head of Repository Systems, Symplectic Ltd. – richard@symplectic.co.uk (1.0)• Stuart Lewis – IT Innovations Analyst and Developer, The University of Auckland Library – s.lewis@auckland.ac.nz (1.2)• Monica Roos – Special Librarian, The University of Bergen Library – monica.roos@ub.uib.no (1.4)• Kim Shepherd – IT Software Analyst and Developer, The University of Auckland Library, – k.shepherd@auckland.ac.nz (1.5)• Elin Stangeland – Repository Manager, Cambridge University Library – es444@cam.ac.uk (1.2) SL
  4. 4. A bird’s eye view ES
  5. 5. Data structures SL
  6. 6. The information architecture• AKA: “What goes where?”• Parts of an item are stored in different places • …for different reasons• Some data needs to be searched• Some data needs to be displayed• Some data is large LH
  7. 7. The Database RJ
  8. 8. Files on disk • Files stored in the ‘asset store’ • Each bitstream has an internal id; a large integer derived from the date of submission and checksum o e.g. 908735856294756292618068267495783 • Files are stored in the directory ab/cd/ef, where abcdef are the first 6 characters of the internal id o e.g. 90/87/35 • Files are stored with the internal id as the filename • All actual information about the file is in the "bitstream" table in the database, including the internal id. 908735856294756292618068267495783 -> /dspace/[assetstore]/90/87/35/908735856294756292618068267495783 ES
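As a rough illustration of the layout described on this slide, the sketch below turns an internal id into an asset store path. The function name `asset_store_path` and the base directory `/dspace/assetstore` are assumptions made for the example; the real logic lives inside DSpace and may differ in detail.

```python
from pathlib import PurePosixPath


def asset_store_path(assetstore_dir, internal_id):
    """Return the on-disk location for a bitstream's internal id.

    Hypothetical helper: it only mirrors the rule from the slide
    (first six characters -> three two-character directories).
    """
    if len(internal_id) < 6:
        raise ValueError("internal id must have at least 6 characters")
    # Split the first six characters into three directory names.
    levels = [internal_id[i:i + 2] for i in (0, 2, 4)]
    # The file itself is named after the full internal id.
    return PurePosixPath(assetstore_dir, *levels, internal_id)


# Assumed base directory; a real installation sets its own.
print(asset_store_path("/dspace/assetstore", "908735856294756292618068267495783"))
```

Spreading files over nested two-character directories keeps any single directory from holding an unmanageable number of files, which is presumably why the layout is used.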
  9. 9. Configuration• To make DSpace work in a local context• Enable local settings – urls, links to physical storage etc. • dspace.cfg • Lots of settings – read it one day!• Enable local configurations • Input forms • Metadata crosswalks • OAI-PMH configuration • Language packs RJ
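To give a feel for what "local settings" look like, here is a minimal sketch that reads `key = value` lines in the style of dspace.cfg. The sample values (host name, directories) are made up for illustration, and `parse_cfg` is a hypothetical helper, not part of DSpace. A real dspace.cfg may use features this sketch ignores, such as references to other settings.

```python
def parse_cfg(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Illustrative values only; every installation has its own.
SAMPLE = """
# Local settings (hypothetical values)
dspace.dir = /dspace
dspace.url = http://repository.example.org
assetstore.dir = /dspace/assetstore
"""

settings = parse_cfg(SAMPLE)
print(settings["assetstore.dir"])
```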
  10. 10. Lucene search indexes• Maintained outside of the database• Contains the search index• Very fast for full text searching • Stop words and stemming• A cached copy • Can be re-built or updated from the database • [dspace]/bin/dspace index-update MR
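The idea of stop words and stemming can be shown with a toy inverted index. This is not Lucene's API and not how DSpace calls it; the stop-word list, the deliberately naive `naive_stem` rule and all function names are assumptions made for the example.

```python
# Tiny, made-up stop-word list; real analysers ship much longer ones.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "to"}


def naive_stem(word):
    """Strip a trailing 'ing' or 's' if at least three letters remain."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lower-case, strip punctuation, drop stop words, then stem."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [naive_stem(w) for w in words if w and w not in STOP_WORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every (stemmed) query term."""
    terms = index_terms(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


DOCS = {
    "doc1": "Searching the archived items",
    "doc2": "An item about indexing",
}
INDEX = build_index(DOCS)
print(sorted(search(INDEX, "searching items")))
```

Because the index is derived entirely from the documents, it can be thrown away and rebuilt at any time, which matches the "cached copy" point on the slide.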
  11. 11. Log files • The dspace.log file serves several purposes: - It provides log information useful in debugging - It records who has accessed the system - It is the source for the classic DSpace statistics - It contains error messages useful for fixing problems LH
  12. 12. Solr statistics • Classic statistics were derived from log files • Only generated once a day • Dependent on keeping all your old log files • Solr stats • Statistical information entered in real time into the Solr index • Event-based, not log-based • Able to query statistics • Different views of statistics • Able to write new statistics reports SL
  13. 13. Scheduled tasks• Things need to happen in the background: – filter-media (generate full text search indexes, create image thumbnails) – sub-daily (send daily subscription emails) – stat-general / stat-monthly (generate classic statistics) – generate-sitemaps (creates search engine sitemaps) – checker (checksum checker) LH
  14. 14. Authentication• Authentication is about verifying who a user is • DSpace provides multiple options • In-built database of users and passwords • LDAP login (link to local LDAP or ActiveDirectory) • Shibboleth • IP authentication • Makes use of a plugin stack • You can chain authentication methods • You can create new authentication methods ES
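The "plugin stack" idea can be sketched as a list of methods tried in order until one accepts the credentials. Everything here is hypothetical: the names, the dictionary-backed stand-ins for the built-in user database and an LDAP directory, and the plain-text passwords (which a real system would never store). It shows the chaining pattern only, not DSpace's actual classes.

```python
def dict_backed_method(accounts):
    """Build an authentication method from a {username: password} mapping."""
    def method(username, password):
        return username in accounts and accounts[username] == password
    return method


def authenticate(stack, username, password):
    """Return the name of the first method that accepts the login, else None."""
    for name, method in stack:
        if method(username, password):
            return name
    return None


# Stand-ins for two user sources (illustrative data only).
LOCAL_USERS = {"alice": "correct horse"}
DIRECTORY_USERS = {"bob": "battery staple"}

# Order matters: methods earlier in the stack are tried first.
STACK = [
    ("password", dict_backed_method(LOCAL_USERS)),
    ("ldap-stand-in", dict_backed_method(DIRECTORY_USERS)),
]

print(authenticate(STACK, "bob", "battery staple"))
```

Adding a new authentication method then means writing one more function and appending it to the stack, without touching the existing ones.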
  15. 15. Authorization• Authorization basics • Resource policies • EPeople - DSpace users • Groups • Anonymous • Admin • Special groups • Permissions • READ / WRITE • ADD / REMOVE RJ
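A minimal sketch of how resource policies, groups and permissions might fit together, assuming a policy is simply "this group may perform this action on this resource". The class and function names, the resource identifiers such as `item:42`, and the rule that members of Admin bypass policy checks are all assumptions for illustration rather than DSpace's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePolicy:
    resource: str  # hypothetical identifier, e.g. "item:42"
    group: str     # group the policy applies to
    action: str    # "READ", "WRITE", "ADD" or "REMOVE"


def is_authorized(policies, user_groups, resource, action):
    """Check whether any of the user's groups is granted the action."""
    # Everyone, logged in or not, is treated as a member of Anonymous.
    groups = set(user_groups) | {"Anonymous"}
    # Assumption: administrators are allowed to do anything.
    if "Admin" in groups:
        return True
    return any(
        p.resource == resource and p.action == action and p.group in groups
        for p in policies
    )


POLICIES = [
    ResourcePolicy("item:42", "Anonymous", "READ"),
    ResourcePolicy("collection:7", "Submitters", "ADD"),
]

print(is_authorized(POLICIES, [], "item:42", "READ"))
print(is_authorized(POLICIES, [], "collection:7", "ADD"))
```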
  16. 16. JSPUI styling options • All JSPs: dspace-jspui/dspace-jspui/webapp/src/main/webapp • Main layout: [jsps]/layout and [jsps]/styles.css.jsp • Page localisations should go in dspace/modules/jspui • Caveat: The more changes you make, the harder it is to upgrade MR
  17. 17. XMLUI styling options • Composed of two components that control different parts of the Manakin interface: • Aspects: • "plugins" that provide features for the repository • Themes: • "Rules" that define where themes are installed in the repository. • Style the look-and-feel of the repository. SL
  18. 18. Error messages• Help – an error message! • To help you, we need to know exactly what went wrong • The easiest way to do this, is with a ‘stack trace’ • XMLUI: Shown on the error page • JSPUI: Shown in the HTML source • All: In the dspace.log file SL
  19. 19. Error messages • Help – an error message! • To help you, we need to know exactly what went wrong • The easiest way to do this is with a ‘stack trace’:
      2008-09-18 15:08:13,263 INFO org.dspace.app.xmlui.utils.AuthenticationUtil @ [EMAIL PROTECTED]:session_id=F1FB96AF6FA3464C393A3366621534A4:ip_addr=139.147.66.108:login:type=explicit
      java.lang.NullPointerException
      at org.dspace.authenticate.LDAPHierarchicalAuthentication.getSpecialGroups(LDAPHierarchicalAuthentication.java:144)
      at org.dspace.authenticate.AuthenticationManager.getSpecialGroups(AuthenticationManager.java:308)
      at org.dspace.app.xmlui.utils.AuthenticationUtil.logIn(AuthenticationUtil.java:222)
      RJ
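When reporting a problem it helps to pull out the exception name and the first few frames of the trace. The sketch below does that with two regular expressions. It is a hypothetical helper, not a DSpace tool; the patterns assume Java-style frames of the form `at package.Class.method(File.java:123)`, and the sample text is retyped from the trace on this slide, so treat it as illustrative.

```python
import re

# Matches e.g. "at org.example.Foo.bar(Foo.java:12)".
FRAME = re.compile(r"\bat\s+([\w.$]+)\(([\w$]+\.java):(\d+)\)")
# Matches a dotted class name ending in "Exception" or "Error".
EXCEPTION = re.compile(r"\b((?:[a-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b")


def summarise_trace(text):
    """Return (exception name or None, list of (method, file, line) frames)."""
    exc = EXCEPTION.search(text)
    frames = [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in FRAME.finditer(text)
    ]
    return (exc.group(1) if exc else None), frames


SAMPLE_TRACE = """java.lang.NullPointerException
at org.dspace.authenticate.LDAPHierarchicalAuthentication.getSpecialGroups(LDAPHierarchicalAuthentication.java:144)
at org.dspace.authenticate.AuthenticationManager.getSpecialGroups(AuthenticationManager.java:308)
at org.dspace.app.xmlui.utils.AuthenticationUtil.logIn(AuthenticationUtil.java:222)
"""

exception, frames = summarise_trace(SAMPLE_TRACE)
print(exception)
for method, source_file, line in frames:
    print(f"  {method} ({source_file}:{line})")
```

The top frame is usually the most useful one to quote, since it names the class and line where the error was raised.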
  20. 20. Getting help from dspace-tech
      To: dspace-tech@lists.sourceforge.net
      Subject: Error with XYZ when ABC happens
      Dear dspace-tech,
      [Description of your problem]
      [Stack trace and/or log file extracts]
      [DSpace version]
      [Environment: OS flavour]
      [Any other relevant details]
      Please help!
      Richard
      MR
  21. 21. DSpace Under the Hood: Any questions? ES
  22. 22. DSpace Under the Hood: The DSpace Community Development Process (and your role in it) RJ
  23. 23. DSpace community vs Commercial Company • Core Development -> DSpace Analogue: • Analysts -> Repository Managers • Software Developers -> In-house, external consultants, committers, contributors • Designers -> Software Developers • Testers -> Committers and community; testathon RJ
  24. 24. DSpace community vs Commercial Company • Product Environment -> DSpace Analogue: • Writers -> Repository Managers, Committers, DuraSpace • Marketing -> DuraSpace, the Community • Sales -> DuraSpace • Customer Support -> Mailing Lists RJ
  25. 25. DSpace community vs Commercial Company • Organisational Support -> DSpace Analogue: • Project Management -> Repository Managers, DuraSpace • Finance and HR -> Participating Institutions, DuraSpace ES
  26. 26. How the community fits together SL
  27. 27. DSpace committers group • Committer = able to ‘commit’ code to the code repository • General voting rights (usually opened to anyone) • Becoming a committer • Meritocracy • Vote by current committers • We keep an eye out for new potential committers • You can suggest yourself as a new committer • We need more committers! • Some committers are not developers • We like to see: – Dedication to DSpace (usually via employment) – Friendly and helpful (participate in email lists) – Contributes code – Joins in development discussions • Some take years, some take only a few months ES
  28. 28. DSpace Global Outreach Committee• Repository managers group• Current projects o DSUG meetings at Open Repository Conferences o Community requirements gathering o DSpace instance database development• Engagement in development process LH
  29. 29. DSpace Ambassadors • A recent initiative with a focus on regional networking • Focusing on connecting DSpace users together • Providing support, networking and mentoring for new users • Further details: Contact Valorie Hollister MR
  30. 30. Developing code for DSpace • Submit patches • Follow guidelines https://wiki.duraspace.org/display/DSPACE/Guidelines+for+Committing • Don’t throw code over the fence and leave - engage with the committers • Inform the community what you’re doing • Ask for feedback, early and often ES
  31. 31. JIRA • Issue tracking and project tracking for software development • A worldwide tool used by many groups for many purposes • Organised around projects (e.g. DSpace 1.x), components (e.g. XMLUI), issues (e.g. a software bug) and workflows • Issue prioritisation • Voting ES
  32. 32. What goes in JIRA? • Bug Reports o Supply as much information as possible o If you’re not sure, email dspace-tech first • Bug Fixes o If you think you’ve fixed a bug, attach a patch! o Encourage peer review and constructive suggestions • New Features o JIRA helps manage collaborative work on new features o Make use of "sub-tasks" and relationships ES
  33. 33. What doesn’t go in JIRA?• Technical questions/problems. Better places to get help are: • dspace-tech mailing list • #dspace on IRC• Suggestions/ideas not previously discussed • New ideas are always wanted, but... • Discuss them with the community first • Mailing lists, community surveys, IRC, conferences or user groups are good places to float new ideas LH
  34. 34. Versions • Major 1.0 (big changes, database changes) – Minor 1.6 (smaller changes) • Sub-minor 1.6.2 (bug fixes) • Upgrade routes: – If I’m on 1.3.2 and I want to get to 1.6.2 • 1.3.2 -> 1.4 • 1.4 -> 1.5 • 1.5 -> 1.6.2 SL
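The upgrade route on this slide follows a simple pattern: step through each intervening minor release, then go straight to the target. The sketch below encodes that pattern. It generalises from the single example on the slide, so the rule itself is an assumption, as are the function names; always check the upgrade notes for the versions you are actually moving between.

```python
def parse_version(version):
    """Turn '1.6.2' into (1, 6, 2) for comparison."""
    return tuple(int(part) for part in version.split("."))


def upgrade_route(current, target):
    """List the (from, to) hops between two versions of the same major release."""
    cur, tgt = parse_version(current), parse_version(target)
    if cur[0] != tgt[0]:
        raise ValueError("sketch only handles routes within one major version")
    if cur >= tgt:
        raise ValueError("target must be newer than the current version")
    hops = []
    here = current
    # Visit every minor release strictly between the current and target ones.
    for minor in range(cur[1] + 1, tgt[1]):
        step = f"{cur[0]}.{minor}"
        hops.append((here, step))
        here = step
    # Final hop goes directly to the requested target release.
    hops.append((here, target))
    return hops


for start, end in upgrade_route("1.3.2", "1.6.2"):
    print(f"{start} -> {end}")
```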
  35. 35. Release co-ordinator • Co-ordinates the release – Sets deadlines – Manages processes – Public relations – Deciding vote (very rarely used) – Tries to encourage participation MR
  36. 36. How development works• Weekly IRC meetings • Anyone welcome• DSpace-devel email list• Processes continue to change and improve: – Recent changes: • No partial features in trunk • No new features without supporting documentation RJ
  37. 37. Post-development / Pre-release• Developer Testing • Eliminate the obvious code bugs• Community Testathon • Find user experience bugs, less obvious problems• Bug Fix • Not just for committers - contribute here too• Create release candidate • Start with x.y (e.g. 1.5)• Repeat as necessary • Increment sub-minor part of version number • (e.g. 1.5.2 is the second bugfix release for 1.5) LH
  38. 38. DSpace email lists• Send to an appropriate list• Include enough useful information• Leave it a while before prompting if no reply• Please don’t email people directly LH
  39. 39. DSpace email lists• Who lives in which house?• DSpace General (dspace-general) • General community discussion around digital repositories and related applications• DSpace Technical (dspace-tech) • Technical discussion/support around DSpace installation, configuration and operation• DSpace Development (dspace-devel) • Discussion/support around developing DSpace • Automated notices and alerts from JIRA • Suggestions for new features/improvements • Release announcements and updates MR
  40. 40. IRC • There are two "rooms" dedicated to DSpace 1. #dspace: For all general DSpace questions and answers 2. #duraspace: For committer and developer meetings and other DuraSpace activities. The easiest way to access the service is from: http://webchat.freenode.net/ SL
  41. 41. See also… • The DSpace wiki resource page: • https://wiki.duraspace.org/display/DSPACE/DSpaceResources • DSpace manual: • https://wiki.duraspace.org/display/DSPACE/DSpaceResources#DSpaceResources-DSpaceSystemDocumentation • The DSpace course: • http://hdl.handle.net/2160/615 LH
  42. 42. How can you play a part? What are your interests? What are your passions? What could you do? SL
  43. 43. Photo credits• Under the hood image: – http://www.flickr.com/photos/andrew_buckie/209493609/
  44. 44. DSpace Under the Hood: Any questions?
