Handles
• Once you have a handle prefix assigned by CNRI (as described
in the Server Configuration section below), set you...
Digital Object Identifiers
(DOIs) in DSpace
• May be used in parallel with the handle functionality that is
inextricable f...
Digital Object Identifiers
(DOIs) with DataCite
• Lots to configure - assigned prefix, credentials, and update
event handl...
Digital Object Identifiers
(DOIs) with EZID
• Less to configure – uncomment the EZID block in dspace.cfg
• Uncomment beans...
Activating Legacy
Search/Browse
• Open the file C:\dspace\config\xmlui.xconf
• Comment out the Discovery aspect and uncomment...
Changing the Browse Indexes (1/4)
• Return to the dspace.cfg file.
• Track down where the webui.browse.index.n
entries beg...
Changing the Browse Indexes (2/4)
• As it happens, we haven’t initialized our browse and browse
indices yet. That needs to...
Changing the Browse Indexes (3/4)
• Once you see your new index choice, you will note that it
is rather user unfriendly:
x...
Changing the Browse Indexes (4/4)
• Here are the message keys required to render the new browse
field nicely:
• xmlui.Arti...
Additional Configurations
• Statistics
• Media Filters
• Curation Tasks
Statistics - Outline
• Repository Overview
• Handle level
• Administrative view
• Google analytics
Statistics –
Repository Overview
• Usage – Top ten items
• Search – Top ten terms and click through
• Workflow – Total act...
Statistics – Handles
(i.e. Items and Collections)
• Usage – Total views as well as monthly views for the past 7 months
• S...
Statistics – Administrative
View
• Requires a few command-line tasks to activate
• stat-initial
• stat-report-initial
• st...
Statistics –
Google Analytics
• An alternative approach to tracking your repository
• Requires a Google Analytics account,...
Configuring Media Filters
• Open up dspace.cfg
• Find the “Media Filter / Format Filter” portion.
Configuring Media Filters
• There are several setting groups of note:
• filter.plugins activates the filters with their h...
Configuring Curation Tasks
• A relatively simple way to implement and apply custom
operations to repository objects.
• Act...
DSpace’s Helper Servers
• The Handle Server
• Apache Solr - a webservice
using Apache Lucene
• SWORD – Simple Webservice
O...
DSpace’s Helper Servers
• All except the handle server exist as webapps under Tomcat
• The handle server is started separa...
The Handle Server
• Run [dspace-installation-dir]/bin/dspace make-handle-config
[dspace-installation-dir]/handle-server to...
The Handle Server
• Edit the [dspace-installation-dir]/handle-
server/config.dct file and inside the “server_config”
block...
The Solr Server
• Service underpinning the Discovery aspect, Statistics aspect,
and OAI service
• Not visible to anyone ex...
The Solr Server
• Solr configuration is set up for you automatically on a fresh
install, but you might want to edit it aft...
The SWORD Server(s)
• SWORD v1 configured in [dspace-install-dir]\modules\sword-server.cfg
• SWORD v2 configured in [dspace...
The OAI Server
• Relies on Solr in DSpace 4.2
• Initialize with [dspace-installation-
dir]/bin/dspace oai import
• Dissemi...
The OAI Server
• Crosswalks (and contexts) can be added (or removed) by editing
[dspace-install-
dir]/config/crosswalks/oa...
The REST API
• Read-only access for the Anonymous user
• Simply provides JSON responses statelessly, and is not
configurab...
Final Thoughts on
Configuration
• Everything requires a restart, which can be a pain in production
environments requiring ...

DSpace 4.2 Basics & Configuration

  1. DSpace 4.2 Advanced Training
     DSpace 4.2 Advanced Training by James Creel is licensed under a Creative Commons Attribution 4.0 International License. Special thanks to the DuraSpace Foundation and the Texas Digital Library for making this course possible.
  2. Course Outline
     • Day 1: Quick Review of DSpace Basics; DSpace Configuration
     • Day 2: DSpace Content Transmission
     • Day 3: Theming for the XMLUI; Preview of DSpace 5.0
  3. Why XMLUI?
     • Customizable, Modular, Extensible
        • Themes
        • Aspects
     • Specific features
        • Harvesting
        • Controlled vocabulary
  4. What goes into a DSpace installation?
     [Diagram: Source Code, then mvn package, then Installer, then ant build, then Installation, whose Webapps are deployed as Webapps in the Apache Tomcat Webserver]
  5. What goes into a DSpace installation (1/4) - Source Code
     • Get it on SourceForge (zip) or GitHub
     • Where does the source code live?
        • C:\Development\dspace-4.2-src-release
        • Call this [dspace-src-dir]
  6. What goes into a DSpace installation (2/4) - Installer
     • Using mvn package from the [dspace-src-dir], we create the "installer" at [dspace-src-dir]\dspace\target\dspace-4.2-build.
     • The build directory will contain a build.xml for use with the Apache ant tool.
     • Once packaged, this installer can be moved anywhere on the filesystem and invoked. A bug in some JDKs on Windows will cause ant to crash if the directory is nested too deeply, so you might need to move it up a bit in the filesystem!
     • Apache ant will read a dspace.cfg file to determine where to install DSpace and how to connect to the database, among other details. By default, it will use the configuration in its own config directory, but this can be overridden with the -Dconfig= parameter.
     • Configuration may also be customized by editing the build.properties file.
  7. What goes into a DSpace installation (3/4) - Installation
     • From the installer directory, we can do a new installation with
       ant -Dconfig=[path-to-dspace.cfg] fresh_install
       which will create an empty DSpace database and overwrite all the configuration with what was specified.
     • We can update the code (leaving the database and configuration untouched) with
       ant -Dconfig=[path-to-dspace.cfg] update
     • DSpace will be installed in the directory specified by your chosen config file, in our case
        • C:\dspace
        • Call this [dspace-installation-dir]
  8. What goes into a DSpace installation (4/4) - Webapps
     The webapps under [dspace-installation-dir]\webapps should then be deployed under Tomcat. I usually do this with a symlink or by copying them over.
  9. Structuring the Repository Content
     • Communities can contain sub-communities and collections
     • Collections contain only items
     • Items contain only bundles
     • Bundles contain only bitstreams
     • Bitstreams are the objects of ultimate interest
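The containment rules above can be sketched as a small data model. This is only an illustration of the hierarchy, not DSpace's own classes; the sample names ("Top", "First", "Sample", and the "ORIGINAL" bundle) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Bitstream:
    name: str


@dataclass
class Bundle:
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    title: str
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    # Communities are the only level that can nest inside itself.
    subcommunities: List["Community"] = field(default_factory=list)
    collections: List[Collection] = field(default_factory=list)


def all_bitstreams(community: Community) -> Iterator[str]:
    """Walk the hierarchy and yield every bitstream name beneath a community."""
    for sub in community.subcommunities:
        yield from all_bitstreams(sub)
    for collection in community.collections:
        for item in collection.items:
            for bundle in item.bundles:
                for bitstream in bundle.bitstreams:
                    yield bitstream.name


# Hypothetical sample data mirroring the "first community / collection / item" exercise.
top = Community(
    "Top",
    collections=[
        Collection(
            "First",
            items=[Item("Sample", bundles=[Bundle("ORIGINAL", [Bitstream("sample.txt")])])],
        )
    ],
)
print(list(all_bitstreams(top)))
```

Note that only Community refers to itself, which is what lets sub-communities nest to any depth while the lower levels stay fixed.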
  10. Structuring the Repository
     [Diagram: a tree with the top-level community or communities at the first tier, sub-communities and collections at the second, and items available at the third. Each item contains bundles, and each bundle contains bitstreams.]
  11. Some preliminaries
     • Log in as admin@admin.com, password "admin"
     • From the homepage, create the first top-level community
     • From the community's page, create the first collection
     • Therein, submit the first item. I used the text file sample.txt as the bitstream.
  12. Accessing Content via XMLUI
     • Communities, collections, and items are addressable by handles.
     • Bitstreams are addressable by their item's handle plus a filename or sequence number (note that a filename will override a sequence number).
     • Bundles are not directly addressable.
  13. Let's view the item
     • Click through the community/collection hierarchy to find the item
     • It is addressed via handle, for example:
       http://localhost:8080/xmlui/handle/123456789/3
     • Consider a single file within one of the item's bundles:
       http://localhost:8080/xmlui/bitstream/handle/123456789/3/sample.txt?sequence=1
     • Note that
       http://localhost:8080/xmlui/bitstream/handle/123456789/3/sample.txt
       works just as well, as does
       http://localhost:8080/xmlui/bitstream/handle/123456789/3/?sequence=1
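As a rough illustration of the addressing scheme, the sketch below assembles the same URLs from a handle, a filename, and a sequence number. The helper names are hypothetical, and the base URL is simply the localhost address used in the examples above.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://localhost:8080/xmlui"  # the training instance used in the examples


def item_url(handle: str) -> str:
    """URL of a community, collection, or item, all of which are addressed by handle."""
    return f"{BASE_URL}/handle/{handle}"


def bitstream_url(handle: str, filename: Optional[str] = None,
                  sequence: Optional[int] = None) -> str:
    """URL of a bitstream: the item's handle plus a filename and/or sequence number."""
    if filename is None and sequence is None:
        raise ValueError("a bitstream needs a filename or a sequence number")
    url = f"{BASE_URL}/bitstream/handle/{handle}/{quote(filename or '')}"
    if sequence is not None:
        url += f"?sequence={sequence}"
    return url


print(item_url("123456789/3"))
print(bitstream_url("123456789/3", "sample.txt", 1))
print(bitstream_url("123456789/3", sequence=1))
```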
  14. Some basic functionality for review
     • Moving and mapping items
     • Editing metadata individually and in batch
     • Managing epersons and groups
     • Permissions
     • Registries
        • Format
        • Metadata
  15. Advanced DSpace 4.2 Training - Configuration
  16. Configuring DSpace 4.2 and the XMLUI
     • Themes and Aspects
     • Emails
     • Authentication
     • Item Submission
     • Affordances for Information Seeking (i.e. exposing information)
     • Configuring Statistics
     • Configuring Media Filters
     • Configuring Curation Tasks
     • All the helper servers
        • Handle server
        • Solr
        • SWORD
        • OAI
        • REST
  17. Themes and Aspects
     • Themes and aspects are applied in the dspace\config\xmlui.xconf configuration file.
     • The root element, <xmlui>, contains two sub-elements, <aspects> and <themes>.
  18. Theming for the XMLUI
     • Inside the xmlui webapp, you can find a themes directory containing the themes.
     • Each theme is configured in its constituent sitemap.xmap file.
     • As with other configuration changes, the server must be restarted to enact changes to how themes are applied.
     • However, a currently applied theme may be edited on the fly.
  19. Theming for the XMLUI
     • Within your xmlui.xconf file, have a look at the <themes> section.
     • Use the handle attribute to apply a theme to specific handles in your repo.
     • Use the regex attribute to apply a theme to pages based on their URL path.
     • Note that themes listed first will take precedence.
     • Apply another theme to your favorite collection (look into your xmlui/themes directory for a few options) by using a handle.
     • Apply another theme to the community-list page using a regular expression.
     • Restart Tomcat to see the effects.
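The "first listed wins" rule can be pictured with a small sketch. This is a simplified model rather than the XMLUI's actual matching code: the rule list, the theme names, and the use of an unanchored regular-expression search are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class ThemeRule(NamedTuple):
    """One <theme> entry: it targets either a handle or a URL-path regex."""
    name: str
    handle: Optional[str] = None
    regex: Optional[str] = None


def pick_theme(rules: List[ThemeRule], url_path: str,
               handle: Optional[str] = None) -> Optional[str]:
    """Return the name of the first rule that matches; earlier rules take precedence."""
    for rule in rules:
        if rule.handle is not None and rule.handle == handle:
            return rule.name
        if rule.regex is not None and re.search(rule.regex, url_path):
            return rule.name
    return None


# Hypothetical rule list, in the order the entries would appear in <themes>.
RULES = [
    ThemeRule("CollectionTheme", handle="123456789/2"),
    ThemeRule("ListTheme", regex="community-list"),
    ThemeRule("DefaultTheme", regex=".*"),
]

print(pick_theme(RULES, "handle/123456789/2", handle="123456789/2"))
print(pick_theme(RULES, "community-list"))
print(pick_theme(RULES, "browse"))
```

Because the catch-all ".*" rule matches every page, it has to come last; listing it first would shadow the more specific rules.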
  20. Theming for the XMLUI - Internationalization
     • Within the xmlui/i18n directory, you will find the messages.xml file.
     • As part of the processing pipeline that generates XMLUI pages, keys are replaced by localized, human-readable text as specified in messages.xml.
     • This is also very important for branding your repository, as you might not want it always referred to as "DSpace". Brand your repository appropriately if desired.
  21. Aspects in the XMLUI
     • Within your xmlui.xconf file, have a look at the <aspects> section.
     • Aspects are Java code that interacts with DSpace entities and contributes to the DRI (Digital Repository Interface) XML that is subsequently styled by the applied theme.
     • Uncomment the Versioning aspect and restart Tomcat to see this new functionality on the sample item's page.
  22. System Emails (1/4) - Configure DSpace to send email
     • Open jEdit and bring up dspace.cfg
     • Set the following values:
        • mail.server = gator3159.hostgator.com
        • mail.server.username = dspacetraining@brazosdatatech.com
        • mail.server.password = SuperSecretP@55w0rd
     • Uncomment the 3 lines of mail.extraproperties
  23. System Emails (2/4) - Check that email works
     • DSpace provides a nice command line facility to check email functionality without having to go through any real workflows.
     • From the command line, run the test-email script.
  24. System Emails (3/4) - DSpace email templates
     • Once email is working, customizing email texts is one of the easiest customizations one can do.
     • Simply find the email template of interest in C:\dspace\config\emails
     • Changes to these files do not require a restart of Tomcat.
     • The significance of the numbers in braces depends on the calling Java class, and can be determined from the context.
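To see how the numbers in braces behave, here is a minimal sketch of positional substitution. The template text is invented for the example (it is not the stock register template), and Python's str.format is used only as a stand-in for whatever the calling Java class does with the placeholders.

```python
# Hypothetical template text: {0} and {1} are positional placeholders whose
# meaning is decided by the code that fills the template in.
TEMPLATE = (
    "To complete registration for a {0} account, please click the link below:\n"
    "\n"
    "  {1}\n"
)


def render(template: str, *args: str) -> str:
    """Replace each {n} with the n-th argument supplied by the caller."""
    return template.format(*args)


message = render(
    TEMPLATE,
    "My Repository",
    "http://localhost:8080/xmlui/register?token=abc",
)
print(message)
```

The practical consequence for template editing is that you can move or drop a placeholder freely, but you cannot invent a new number that the calling class does not supply.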
  25. System Emails (4/4) - Try out an email template modification
     • Make some modifications to the register email template.
     • Navigate to the landing page, and click register on the right-hand menu (or on the login page).
     • Put in an email address you can get to over the web. If you need one, ask me to get temporarily set up.
     • Check your email to see whether your changes were effective.
  26. Authentication - Introduction
     • Configuring the Authentication Stack
     • IP-based authentication
     • Institutional authentication with attributes
  27. Configuring IP-based authentication (1/2)
     • Create a new eperson group for your DSpace repository.
     • Pick an item and set its bitstream's permissions such that only members of the new group can read it.
        • Delete the existing policy
        • Add a new policy for the group
  28. Configuring IP-based authentication (2/2)
     • Bring up [dspace-install-dir]\config\modules\authentication.cfg
     • Find the stack of authentication methods
     • Add org.dspace.authenticate.IPAuthentication
     • Bring up [dspace-install-dir]\config\modules\authentication-ip.cfg
     • Add your IP to a new ip.GROUPNAME line.
     • What is your IP? Due to the vagaries of networking, this is best determined by checking [dspace-install-dir]\log\dspace.log.[date] for today's date.
     • Restart Tomcat, ensure you're logged out, and see if you can read the item.
     • Pro Tip: If you're behind a load balancer or another proxying server, set useProxies = true in dspace.cfg and make sure your sysadmin forwards the IP headers.
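Conceptually, each ip.GROUPNAME line ties a group to a set of addresses, and a visitor whose address falls inside one of them is treated as a member for that session. The sketch below models only that idea with Python's ipaddress module. The group name and ranges are made up, and the CIDR notation used here is an assumption; check the comments in authentication-ip.cfg for the forms DSpace actually accepts.

```python
import ipaddress
from typing import Dict, List


def groups_for_ip(ip_config: Dict[str, List[str]], client_ip: str) -> List[str]:
    """Return every group whose configured ranges contain the client's address."""
    addr = ipaddress.ip_address(client_ip)
    matched = []
    for group, ranges in ip_config.items():
        for cidr in ranges:
            # strict=False tolerates host bits being set, e.g. "10.1.2.3/16".
            if addr in ipaddress.ip_network(cidr, strict=False):
                matched.append(group)
                break  # one matching range is enough for this group
    return matched


# Hypothetical stand-in for a line such as "ip.CAMPUS = ..." in authentication-ip.cfg.
IP_CONFIG = {"CAMPUS": ["10.1.0.0/16", "192.168.5.7/32"]}

print(groups_for_ip(IP_CONFIG, "10.1.2.3"))
print(groups_for_ip(IP_CONFIG, "8.8.8.8"))
```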
  29. Configuring Attribute-based Authentication
     • LDAP and Shibboleth are two institutional authentication schemes that can be used to convey headers full of attributes to DSpace.
     • Enable in XMLUI by adding their module's classes to the auth stack.
     • Both employ an autoregister field in their configurations that permits users to be registered as epersons automatically upon logging in with the institutional authenticator.
  30. Configuring Attribute-based Authentication - LDAP attributes
     • Configure with the file [dspace-install-dir]\config\modules\authentication-ldap.cfg
     • Denote your institution's LDAP server, for example
       provider_url = ldap://ldap.myu.edu/o=myu.edu
     • Authentication occurs based on an object context, for example
       object_context = ou=people, o=myu.edu
     • When autoregistering the eperson, eperson metadata (username, email, phone, etc.) are populated from specified LDAP values, for example
       id_field = uid
       email_field = mail
     • Metadata are looked up based on a search context, for example
       search_context = ou=people, o=myu.edu
     • Check with your LDAP provider for your local configuration.
  31. Configuring Attribute-based Authentication - LDAP Group Mapping
     • All LDAP-authenticated users can be put into a specific group with the configuration value
       login.specialgroup = group-name
     • Specific group mappings on the basis of LDAP DNs (distinguished names) are achieved with the login.groupmap.[n] (for n > 0) values. These values are DN substrings followed by a colon and a DSpace group, for example
       login.groupmap.1 = ou=Students:students
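The groupmap format (DN substring, colon, DSpace group) is easy to misread, so here is a small sketch that parses such lines and applies them to a distinguished name. It only models the idea; the case-insensitive substring comparison is an assumption, and the sample DNs and the second mapping are invented.

```python
from typing import List, Tuple


def parse_groupmap(lines: List[str]) -> List[Tuple[str, str]]:
    """Turn 'login.groupmap.N = <DN substring>:<group>' lines into (substring, group) pairs."""
    mappings = []
    for line in lines:
        key, _, value = line.partition("=")  # split on the first '=' only
        if not key.strip().startswith("login.groupmap."):
            continue
        # The group name follows the last colon; the DN part may itself contain '='.
        dn_part, _, group = value.strip().rpartition(":")
        mappings.append((dn_part.strip(), group.strip()))
    return mappings


def groups_for_dn(mappings: List[Tuple[str, str]], dn: str) -> List[str]:
    """Return the groups whose DN substring occurs in the user's DN (assumed case-insensitive)."""
    dn_lower = dn.lower()
    return [group for fragment, group in mappings if fragment.lower() in dn_lower]


CONFIG_LINES = [
    "login.groupmap.1 = ou=Students:students",
    "login.groupmap.2 = ou=Faculty:faculty",  # hypothetical second mapping
]
MAPPINGS = parse_groupmap(CONFIG_LINES)
print(MAPPINGS)
print(groups_for_dn(MAPPINGS, "uid=jdoe,ou=Students,o=myu.edu"))
```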
  32. Configuring Attribute-based Authentication - Shibboleth Attributes
     • Configure with authentication-shibboleth.cfg
     • Two types of sessions: lazy and active
        • Lazy sessions may be required by the application when needed, so they are suitable for DSpace instances where many pages should be public.
        • Active sessions restrict domains entirely and so run counter to open access.
     • The lazy session login URL may be specified, for example
       authentication.shib.lazysession.loginurl = /Shibboleth.sso/Login
     • eperson attributes may be populated specifically in the case of netid, surname, given name, and email, and en masse with a list. These mappings are achieved with, for example,
       email-header = SHIB-MAIL
  33. Configuring Attribute-based Authentication - Shibboleth Role-based Group Mapping
     • Shibboleth attributes may be scoped by means of the at-sign '@', as in attribute@scope
     • In order to map epersons into groups based on Shibboleth attributes, first determine how to deal with the scoping of the attributes.
     • Optionally, one may ignore attributes (use scope only) or ignore scope and just use the attribute.
     • In addition, one must specify which header to examine for the role attributes, e.g.
       role-header = SHIB_SCOPED_AFFILIATION
     • Finally, epersons can be assigned to groups based on attributes, e.g.
       role.faculty = Faculty, Member
  34. Final Thoughts on Authentication
     • Centralized methods require coordination with Campus or Department IT.
     • Groups are determined dynamically based on the headers for each session and not stored in the database. This can be desirable as people's roles change at an institution.
  35. Submission Workflows - Topics
     • Creating new submission workflows
     • Applying customizations to specific collections
     • Creating custom input forms
  36. Submission Workflow (1/5) - modifying the workflow
     • Use jEdit to open the file C:\dspace\config\item-submission.xml
     • There are 3 crucial sections to the file:
        • <submission-map>: maps workflows to collections
        • <step-definitions>: if you want to re-use specific submission workflow steps, describe their headings and Java classes here
        • <submission-definitions>: describes workflows in terms of their individual steps
  37. Submission Workflow (2/5) - modifying the workflow
     • Consider the <submission-definitions> section at the bottom.
     • The default order for steps is familiar:
       Select Collection -> Initial Questions -> Describe -> Upload -> Verify -> License -> Complete
     • We can easily duplicate the traditional workflow, rename it, modify it, and apply it.
  38. Submission Workflow (3/5) - modifying the workflow
     • Copy the traditional submission process in order to make a new <submission-process> with a name of your choosing.
     • Move the upload step to be step 2.
     • Change the default <name-map> in the <submission-map> at the top to use the new submission process.
     • Restart Tomcat.
     • Try out the new workflow.
  39. Submission Workflow (4/5) - Customizing the submission form for a specific collection
     • Obtain the handle of a non-harvested collection.
     • Restore the default to traditional submission mapping.
     • Make a new mapping from your chosen collection to the modified workflow.
     • Restart Tomcat once more, and make sure the modified workflow properly applies.
  40. Submission Workflow (5/5) - Enabling Advanced Embargo
     • Enabling this feature requires the additional step of setting webui.submission.restrictstep.enableAdvancedForm=true in the dspace.cfg
     • Visit the item-submission.xml file again, comment out the UploadStep, and uncomment the ManageAccess and UploadWithEmbargo steps in the new workflow.
     • Restart Tomcat, and visit the item submission again to see the new steps.
  41. Creating Custom Input Forms (1/5)
     • Use jEdit to open the file C:\dspace\config\input-forms.xml
     • Similarly to the item-submission.xml file, you will find a <form-map> section and a <form-definitions> section.
     • Within the <form-definitions> are <form> elements which characterize the metadata input.
  42. Creating Custom Input Forms (2/5)
     • DSpace 4.2 has a traditional form with exactly 2 pages (recall that there are two pages of strictly metadata-oriented input in the traditional submission workflow).
     • Let's duplicate the form element and give it a new name.
     • Then, let's map the new form(s) to the same collection we used for our un-traditional submission workflow.
  43. Creating Custom Input Forms (3/5)
     • The pages can be modified in a number of ways:
        • Target metadata field (schema, element, qualifier)
        • Labels
        • Repeatability
        • Input-type
        • Requirement
     • In addition, we can customize the value pairs used for dropdowns (that's one of the input types).
  44. Creating Custom Input Forms (4/5)
     • Let's create a dropdown for dc.format (or another field if you like)
     • First, make a <value-pairs> element with an appropriate value-pairs-name attribute and a specific dc-term attribute inside the <form-value-pairs> element.
     • Next, make some <pair> elements inside the new <value-pairs> element.
     • Finally, make a new <field> element inside of the <page number="1"> element to use the new value list.
     • Restart Tomcat and check it out!
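If you would rather generate the value list than type it by hand, the following sketch builds a <value-pairs> element with Python's standard XML library. The list name "common_formats" and the sample pairs are invented; the <displayed-value> and <stored-value> child names follow the stock input-forms.xml as best I recall, so compare against an existing <pair> in your own copy before pasting the output in.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_value_pairs(list_name: str, dc_term: str,
                      pairs: Iterable[Tuple[str, str]]) -> ET.Element:
    """Build a <value-pairs> element holding one <pair> per (displayed, stored) tuple."""
    root = ET.Element("value-pairs", {"value-pairs-name": list_name, "dc-term": dc_term})
    for displayed, stored in pairs:
        pair = ET.SubElement(root, "pair")
        # Child names assumed from the stock file: what the submitter sees vs. what is saved.
        ET.SubElement(pair, "displayed-value").text = displayed
        ET.SubElement(pair, "stored-value").text = stored
    return root


# Hypothetical dropdown contents for dc.format.
formats = build_value_pairs(
    "common_formats",
    "format",
    [("PDF", "application/pdf"), ("Plain text", "text/plain")],
)
print(ET.tostring(formats, encoding="unicode"))
```

The printed element belongs inside <form-value-pairs>; the new <field> then refers to the list by its value-pairs-name.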
  45. Custom Input Forms (5/5) - Customizing the Submission Workflow for Item Types
     • The element <type-bind>, containing comma-delimited dc.type values, may be contained in a <field> element.
     • This can only affect the workflow after the dc.type value has been set (so, typically, on the second page of metadata entry).
     • Try adding <type-bind> for some appropriate types such as "article" to the dc.description.abstract field on the second page of the input forms.
     • Restart Tomcat to see the effects.
  46. Controlled Vocabularies for Input Fields (1/3)
     • With controlled vocabularies in DSpace, submitters can choose specific string values that are specified in an XML document, eliminating the ambiguity of free-text fields (at least for anyone who comprehends the semantics of the contents of the XML document).
     • Let's fabricate a controlled vocabulary for the dc.subject.classification field, which comes in the default registry.
  47. Controlled Vocabularies for Input Fields (2/3)
     • Here is the grammar for the XML files:
        • The root element is a <node> element
        • It has an id and a label attribute.
        • It may contain an <isComposedBy> element.
        • An <isComposedBy> element can contain one or more <node> elements.
     • These XML files belong in the [dspace-installation-dir]\config\controlled-vocabularies directory
     • The following vocabularies are already provided:
        • nsi.xml - The Norwegian Science Index
        • srsc.xml - Swedish Research Subject Categories
  48. Controlled Vocabularies for Input Fields (3/3)
     • Make an XML file with, say, 3 <node> elements and save it to the controlled-vocabularies directory.
     • Add a new field for dc.subject.classification to the 2nd page of one of your metadata submission input forms.
     • Include the additional element <vocabulary closed="true">[name]</vocabulary> where name is what you named your XML file (minus the .xml extension).
     • Restart Tomcat and check out your new vocabulary!
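A three-node vocabulary that follows the grammar above can be produced and sanity-checked with a few lines of Python. This is just a sketch: the ids and labels are invented, and the checker only enforces the rules listed on the grammar slide, not anything else DSpace may require.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def make_node(node_id: str, label: str,
              children: Sequence[ET.Element] = ()) -> ET.Element:
    """Create a <node> with id and label, wrapping any children in <isComposedBy>."""
    node = ET.Element("node", {"id": node_id, "label": label})
    if children:
        composed = ET.SubElement(node, "isComposedBy")
        composed.extend(children)
    return node


def follows_grammar(node: ET.Element) -> bool:
    """Check the rules from the slide: nodes carry id and label, and only
    <isComposedBy> elements holding one or more valid <node> children may appear inside."""
    if node.tag != "node" or "id" not in node.attrib or "label" not in node.attrib:
        return False
    for child in node:
        if child.tag != "isComposedBy" or len(child) == 0:
            return False
        if not all(follows_grammar(grandchild) for grandchild in child):
            return False
    return True


# Hypothetical vocabulary with 3 <node> elements: one root and two leaves.
vocab = make_node(
    "everything",
    "Everything",
    [make_node("entity", "Entity"), make_node("event", "Event")],
)
print(ET.tostring(vocab, encoding="unicode"))
print(follows_grammar(vocab))
```

Saving the printed XML under a name of your choosing in the controlled-vocabularies directory gives you the [name] to put in the <vocabulary> element.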
  49. Enabling Information Seeking - Outline
     • Faceted search (aka "Discovery")
     • Handles
     • DOIs
     • Deprecated Legacy Search/Browse
  50. Discovery (i.e. facets)
     • A popular paradigm for information-seeking tasks.
     • The user selects a value for a metadata field. This results in a smaller space of items to search.
     • The user may select a value for another metadata field that is still available in the new search space, resulting in a still smaller search space.
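The narrowing behaviour is easy to see on a toy data set. The sketch below is a plain in-memory illustration of faceting and has nothing to do with how Solr implements it; the items and field names are invented.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical items, each described by a couple of metadata fields.
ITEMS = [
    {"title": "A", "author": "Smith", "subject": "Physics"},
    {"title": "B", "author": "Smith", "subject": "Chemistry"},
    {"title": "C", "author": "Jones", "subject": "Physics"},
]


def apply_facet(items: List[Dict[str, str]], field: str, value: str) -> List[Dict[str, str]]:
    """Keep only the items whose metadata field has the selected value."""
    return [item for item in items if item.get(field) == value]


def facet_counts(items: List[Dict[str, str]], field: str) -> Counter:
    """Count the values still available for a field in the current search space."""
    return Counter(item[field] for item in items if field in item)


by_author = apply_facet(ITEMS, "author", "Smith")        # first selection narrows the space
print(facet_counts(by_author, "subject"))                 # values still on offer
narrowed = apply_facet(by_author, "subject", "Physics")  # second selection narrows further
print([item["title"] for item in narrowed])
```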
  51. Configuring Discovery (1/7)
     • Configuration for the discovery feature may be found in [dspace-install-dir]\config\spring\api\discovery.xml
     • Herein, one may make bean definitions for the Java Spring framework, a means of configuring Java class instances for enterprise applications.
     • The beans have
        • id attributes that are used to reference them by name in Java application code, and
        • class attributes that name the Java class that the bean instantiates and configures
  52. Configuring Discovery (2/7)
     • By analogy with the configuration of input forms and workflows, which are mapped to specific handles (communities and collections), the discovery.xml file includes a bean that maps discovery configurations to handles.
     • This mapping bean has (eliding prefixes) both an id and a class of DiscoveryConfigurationService
     • Find the entry elements inside the map sub-element of this bean.
     • The entry elements will contain a key attribute referencing the target handle and a value-ref attribute referencing the discovery configuration to use.
  53. Configuring Discovery (3/7)
     There are two privileged key attribute values used in the entry elements that are not handles:
     • The key="default" entry element, which applies wherever no more specific mapping exists
     • The optional key="site" entry element, which solely impacts the default search page and the repository homepage
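One way to picture the mapping is as a dictionary lookup with a fallback. The sketch below is a simplified model, not DSpace's code: it assumes that an unscoped page (homepage or default search) looks up "site", that a scoped page looks up its handle, and that anything unmapped falls back to a "default" entry as in the stock discovery.xml. The handle and the name myCollectionConfiguration are invented.

```python
from typing import Dict, Optional


def resolve_configuration(mapping: Dict[str, str], handle: Optional[str] = None) -> str:
    """Pick the discovery configuration bean id for a handle, falling back to 'default'."""
    key = "site" if handle is None else handle  # no scope: homepage / default search page
    return mapping.get(key, mapping["default"])


# Hypothetical contents of the mapping bean: key -> value-ref.
MAPPING = {
    "default": "defaultConfiguration",
    "123456789/2": "myCollectionConfiguration",
}

print(resolve_configuration(MAPPING, "123456789/2"))
print(resolve_configuration(MAPPING, "123456789/9"))
print(resolve_configuration(MAPPING))
```

Since "site" is optional, the last call falls through to the default configuration in this model unless a "site" entry is added to the mapping.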
  54. Configuring Discovery (4/7)
     • The discovery.xml file comes stock with a bean of id defaultConfiguration, which is used by the default mapping.
     • The most expedient means of reconfiguring the facets for a certain community or collection is to copy this bean, rename it (i.e. change its id), and add the mapping to the DiscoveryConfigurationService near the top of the file.
     • Next, we will try adding a field to our new configuration.
  55. Configuring Discovery (5/7)
     • Under the comment that introduces the search filter beans, we find the beans representing our facets.
     • Within the org.dspace.discovery.configuration package there are the classes
        • DiscoverySearchFilter - used for facets that appear only in the dropdown on the Discovery search page
        • DiscoverySearchFilterFacet - used for facets that may appear in both the clickable Discovery sidebar widget and the Discovery search dropdown
        • HierarchicalSidebarFacetConfiguration - used for facets that are compatible with hierarchically expressed metadata field values such as those enabled by the controlled vocabulary mechanism.
     • NOTE: If you want a facet to appear on the sidebar, it must also appear in the Discovery search dropdown!
  56. 56. Configuring Discovery (6/7) • Within a given Discovery configuration bean, there are two crucial lists: • It is by adding and removing references to the search filter beans that we can effect changes to the sidebar facet and Discovery search dropdown facet lists respectively.
  57. 57. Configuring Discovery (7/7) • As a simple test of Discovery configuration, we can start from a fresh duplicate of the defaultConfiguration bean (first changing its id) • Then, duplicate the searchFilterAuthor bean among the search filter configuration beans • Change its id and targeted Dublin Core field(s) • Change the indexFieldName property • This one is a DiscoverySearchFilterFacet instance, so it is able to appear on both the sidebar and the search page. • Add this new search filter configuration bean to the lists in the new configuration bean
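The duplicate-and-rename steps above can be sketched in Python. This is only a sketch under stated assumptions: the bean below is a simplified stand-in for searchFilterAuthor (its property layout is assumed, and XML namespaces are omitted), and the new names searchFilterFormat and format are hypothetical choices for the example. In practice most people simply copy and edit the bean by hand.

```python
import copy
import xml.etree.ElementTree as ET

# Simplified stand-in for the searchFilterAuthor bean; the property layout
# shown here is an assumption made for illustration.
AUTHOR_BEAN = """
<bean id="searchFilterAuthor"
      class="org.dspace.discovery.configuration.DiscoverySearchFilterFacet">
  <property name="indexFieldName" value="author"/>
  <property name="metadataFields">
    <list>
      <value>dc.contributor.author</value>
    </list>
  </property>
</bean>
"""


def clone_filter(bean, new_id, index_field, metadata_fields):
    """Copy a search filter bean, then change its id, index field name and fields."""
    clone = copy.deepcopy(bean)
    clone.set("id", new_id)
    clone.find("property[@name='indexFieldName']").set("value", index_field)
    field_list = clone.find("property[@name='metadataFields']/list")
    for old_value in list(field_list):
        field_list.remove(old_value)
    for field in metadata_fields:
        ET.SubElement(field_list, "value").text = field
    return clone


original = ET.fromstring(AUTHOR_BEAN)
# "searchFilterFormat" and "format" are hypothetical names for this example.
new_bean = clone_filter(original, "searchFilterFormat", "format", ["dc.format"])
print(ET.tostring(new_bean, encoding="unicode"))
```

The original bean is left untouched, which mirrors the advice on the slide: keep the stock bean and work on a copy with its own id.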
  58. 58. Indexing for Discovery • A Solr webapp is provided in Tomcat’s webapps directory. See that Tomcat is running it at http://localhost:8080/solr • Solr will offer to a client on localhost a web form for querying its Discovery index • From the command line, run [dspace-install-dir]\bin\dspace index-discovery • Add -b to the end of that if you get a Java-related error about binary formats • Add -b and -f if your facets stubbornly fail to appear on the sidebar • Add -h to get the help text about all the options
  59. 59. Try out some faceted search • Restart Tomcat and bring your repository back up. • Your Discovery facets will appear on the right, but, as usual, their labels may still need entries in the messages file!
  60. 60. Naturally, if it works on the sidebar, it works in the search interface.
  61. 61. Final Considerations on Discovery • For customizing the search/browse affordances on a per-collection basis, Discovery configuration is a far more lightweight tool than XMLUI themes. • Hierarchical fields enable the facet values to be cut down to the final leaf by means of a delimiter (the double colon “::” in the case of the default controlled vocabulary configuration) – Thus, for example, Everything::Entity appears as Entity under the subject facet. • SearchArtifacts is deprecated and disabled by default in 4.1. This will be a transition worth the costs.
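The leaf-trimming behaviour of hierarchical fields can be pictured with a few lines of Python. This is a sketch of the idea only, not DSpace's own implementation; the function name facet_leaf is made up for this example.

```python
def facet_leaf(value, delimiter="::"):
    """Return the last segment of a hierarchical metadata value."""
    return value.split(delimiter)[-1]


print(facet_leaf("Everything::Entity"))  # prints "Entity"
```

A value with no delimiter is returned unchanged, so flat subject terms are unaffected.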
  62. 62. Handles • Once you have a handle prefix assigned by CNRI (as described in the Server Configuration section below), set your handle prefix in dspace.cfg under the parameter handle.prefix
  63. 63. Digital Object Identifiers (DOIs) in DSpace • May be used in parallel with the handle functionality that is inextricable from DSpace (and will continue to be used for items’ URLs) • Requires registration with an agency, of which there are many. • The Java beans available in DSpace implement support for the DataCite and EZID services – for others, custom code is required. • No facility exists for multiple DSpaces (or other repository instances) to share a prefix and namespace separator and coordinate DOI mintage – but modify the namespace separator between DSpace instances and you will be good to go.
  64. 64. Digital Object Identifiers (DOIs) with DataCite • Lots to configure: assigned prefix, credentials, and update event handling in dspace.cfg • Change the value of the variable ‘publisher’ in [dspace-install-dir]/config/crosswalks/DIM2DataCite.xsl • Uncomment beans in [dspace-install-dir]/config/spring/api/identifier-service.xml • Set up cron jobs to talk to the DataCite API • See the docs for details if this is your use case.
  65. 65. Digital Object Identifiers (DOIs) with EZID • Less to configure – uncomment the EZID block in dspace.cfg • Uncomment beans in [dspace-install-dir]/config/spring/api/identifier-service.xml • Optionally configure the metadata mapping on the bean • See the docs for details if this is your use case.
  66. 66. Activating Legacy Search/Browse • Open the file C:\dspace\config\xmlui.xconf • Comment out the Discovery aspect and uncomment the Search Artifacts aspect • In dspace.cfg, do the following: • Remove discovery from the list of event.dispatcher.default.consumers • Change recent.submissions.count to five or so • Enable the ItemCountDAO, BrowseDAO, BrowseCreateDAO classes for Postgres (instead of Solr)
  67. 67. Changing the Browse Indexes (1/4) • Return to the dspace.cfg file. • Track down where the webui.browse.index.n entries begin • We can here add and remove indexes – I chose to add dc.format: webui.browse.index.5 = format:metadata:dc.format:text
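To make the structure of such an entry explicit, here is a small Python sketch that splits the line from the slide into its parts. The dictionary keys (position, name, type, field, datatype) are labels chosen for this example, and the parser assumes the four-part metadata form shown above; stock entries in other forms would need extra handling.

```python
def parse_browse_index(line):
    """Split a four-part 'webui.browse.index.n = name:type:field:datatype' entry."""
    key, _, value = line.partition("=")
    name, kind, field, datatype = value.strip().split(":")
    return {
        "position": int(key.strip().rsplit(".", 1)[1]),
        "name": name,
        "type": kind,
        "field": field,
        "datatype": datatype,
    }


entry = parse_browse_index("webui.browse.index.5 = format:metadata:dc.format:text")
print(entry)
```

The name part (format here) is the piece that later shows up inside the message keys for the new index.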
  68. 68. Changing the Browse Indexes (2/4) • As it happens, we haven’t initialized our browse and search indexes yet. That needs to be done at the command line with: C:\dspace\bin\dspace index-db-browse -f and C:\dspace\bin\dspace index-lucene-init -f • Restart Tomcat • The new index should show up – if not, click around to another page, or try deleting Tomcat’s cache (it is the C:\Development\tomcat\work\Catalina\localhost\xmlui directory). Caching issues are known to exist on the server…
  69. 69. Changing the Browse Indexes (3/4) • Once you see your new index choice, you will note that it is rather user unfriendly: xmlui.ArtifactBrowser.Navigation.browse_format
  70. 70. Changing the Browse Indexes (4/4) • Here are the message keys required to render the new browse field nicely: • xmlui.ArtifactBrowser.Navigation.browse_format (on the sidebar) • xmlui.ArtifactBrowser.ConfigurableBrowse.title.metadata.format (atop the browse-by controls) • xmlui.ArtifactBrowser.ConfigurableBrowse.format.column_heading (atop the table of results) • xmlui.ArtifactBrowser.ConfigurableBrowse.trail.metadata.format (in the breadcrumb trail) • Provide reasonable display text for these keys in the messages file. • Ctrl+C Tomcat, delete its cache, and restart to see the improvements.
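The four keys follow a visible pattern around the index name, which the Python sketch below reproduces. It simply generalises the keys listed on this slide; the helper name browse_message_keys is invented for the example, and whether other index types use exactly the same pattern is an assumption worth checking against your own installation.

```python
def browse_message_keys(index_name):
    """Build the four message keys the slide lists for a metadata browse index."""
    base = "xmlui.ArtifactBrowser"
    return [
        f"{base}.Navigation.browse_{index_name}",                  # sidebar link
        f"{base}.ConfigurableBrowse.title.metadata.{index_name}",  # browse-by title
        f"{base}.ConfigurableBrowse.{index_name}.column_heading",  # results table
        f"{base}.ConfigurableBrowse.trail.metadata.{index_name}",  # breadcrumb trail
    ]


for key in browse_message_keys("format"):
    print(key)
```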
  71. 71. Additional Configurations • Statistics • Media Filters • Curation Tasks
  72. 72. Statistics - Outline • Repository Overview • Handle level • Administrative view • Google analytics
  73. 73. Statistics – Repository Overview • Usage – Top ten items • Search – Top ten terms and click through • Workflow – Total actions performed
  74. 74. Statistics – Handles (i.e. Items and Collections) • Usage – Total views as well as monthly views for the past 7 months • Search – Top ten terms and click through • Workflow – Total actions performed • Accessible on the sidebar when viewing any DSpace object • Visibility of these pages can be restricted to administrators in the [dspace-install-dir]\config\modules\usage-statistics.cfg file
  75. 75. Statistics – Administrative View • Requires a few command-line tasks to activate • stat-initial • stat-report-initial • stat-general • stat-report-general • stat-monthly • stat-report-monthly • Logged in as administrator, look to Administrative => Statistics on the sidebar
  76. 76. Statistics – Google Analytics • An alternative approach to tracking your repository • Requires a Google Analytics account, which in turn provides you a tracking key for your domain • The key and associated JavaScript must then be put on your website – DSpace facilitates this in the XMLUI with the configuration value xmlui.google.analytics.key=UA-XXXXXX-X
  77. 77. Configuring Media Filters • Open up dspace.cfg • Find the “Media Filter / Format Filter” portion.
  78. 78. Configuring Media Filters • There are several setting groups of note: • filter.plugins activates the filters by their human-readable names • plugin.named.org.dspace.app.mediafilter.FormatFilter assigns human-readable names to the Java classes implementing the filters • filter.org.dspace.app.mediafilter.X.inputformats designates the formats to which filter X applies
  79. 79. Configuring Curation Tasks • A relatively simple way to implement and apply custom operations to repository objects. • Activate with the plugin.named.org.dspace.curate.CurationTask setting, which will consist of key-value pairs such as • org.dspace.ctask.general.ProfileFormats = profileformats • Where the key is the Java class and the value is the designation used for further configuration
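To show how such a setting reads as data, the Python sketch below turns a comma-separated list of "class = name" pairs into a lookup from task name to Java class. The sample value is an assumption assembled for this example (including the backslash line continuation), and the parser is not DSpace's own plugin loader.

```python
# Sample value only; a real setting may list different or additional tasks.
SAMPLE_VALUE = """org.dspace.ctask.general.NoOpCurationTask = noop, \\
    org.dspace.ctask.general.ProfileFormats = profileformats"""


def parse_named_plugins(value):
    """Map each task name to its Java class from 'class = name, class = name'."""
    tasks = {}
    # Join continued lines, then split into individual "class = name" pairs.
    for pair in value.replace("\\\n", " ").split(","):
        if not pair.strip():
            continue
        java_class, _, name = pair.partition("=")
        tasks[name.strip()] = java_class.strip()
    return tasks


tasks = parse_named_plugins(SAMPLE_VALUE)
for name, java_class in tasks.items():
    print(name, "->", java_class)
```

The task name on the left of each printed line is the designation that other curation settings refer to.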
  80. 80. DSpace’s Helper Servers • The Handle Server • Apache Solr - a webservice using Apache Lucene • SWORD – Simple Webservice Offering Repository Deposit • Open Archives Initiative • REST API – Representational State Transfer Application Programming Interface.
  81. 81. DSpace’s Helper Servers • All except the handle server exist as webapps under Tomcat • The handle server is started separately from the command line.
  82. 82. The Handle Server • Run [dspace-installation-dir]/bin/dspace make-handle-config [dspace-installation-dir]/handle-server to do the initial setup. • Next, register your institution with CNRI to get your repository prefix: http://www.handle.net/service_agreement.html , agree to pay some money, and upload the generated sitebndl.zip file. • Wait a week…
  83. 83. The Handle Server • Edit the [dspace-installation-dir]/handle-server/config.dct file and inside the “server_config” block set the pairs "storage_type" = "CUSTOM" and "storage_class" = "org.dspace.handle.HandlePlugin" and change YOUR_NAMING_AUTHORITY to the handle prefix you get from CNRI • The server can be started with [dspace-installation-dir]/bin/start-handle-server. Make sure port 2641 is open through your firewall. • If you’ve been running DSpace for some time without real handles, you can migrate the previously generated ones with the script update-handle-prefix
  84. 84. The Solr Server • Service underpinning the Discovery aspect, Statistics aspect, and OAI service • Not visible to anyone except the localhost since queries upon it can reveal restricted items. • One can check functionality quickly however, with a wget at the localhost’s command line or by seeing if the repository statistics page shows any content whatsoever.
  85. 85. The Solr Server • Solr configuration is set up for you automatically on a fresh install, but you might want to edit it after the fact if you keep the services elsewhere on the network or want to tweak the indexing and searching functionality. • The three services (oai, search, and statistics) are denoted in [dspace-install-dir]/solr/solr.xml • Each is configured in its own conf directory, as with [dspace-install-dir]/solr/statistics/conf
  86. 86. The SWORD Server(s) • SWORD v1 configured in [dspace-install-dir]\config\modules\sword-server.cfg • SWORD v2 configured in [dspace-install-dir]\config\modules\swordv2-server.cfg • Noteworthy configuration values – • URLs for servicedocument and deposit • Which types of packages to accept
  87. 87. The OAI Server • Relies on Solr in DSpace 4.2 • Initialize with [dspace-installation-dir]/bin/dspace oai import • Disseminates metadata with XSL configurable crosswalks
  88. 88. The OAI Server • Crosswalks (and contexts) can be added (or removed) by editing [dspace-install-dir]/config/crosswalks/oai/xoai.xml • The context determines the end of the URL path, for example “request” is the context of http://localhost:8080/oai/request • Once a new XSL file is in place, it can be denoted in the <Formats> section with a unique id attribute and invoked as a <Format> with the corresponding refid attribute in the appropriate <Context> • Just to see an example of how to add a new format, you can copy and rename an existing one and activate it in xoai.xml
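Since the context is just the last path segment, a request URL for any context can be assembled mechanically. The Python sketch below does so for the base URL from this slide; the helper name oai_url is invented for the example, while ListRecords and metadataPrefix=oai_dc are standard OAI-PMH request parameters. Whether a given format is actually offered depends on what the chosen context enables in xoai.xml.

```python
from urllib.parse import urlencode


def oai_url(base, context, verb, **params):
    """Build an OAI-PMH request URL for the given context and verb."""
    query = urlencode({"verb": verb, **params})
    return f"{base.rstrip('/')}/{context}?{query}"


url = oai_url("http://localhost:8080/oai", "request", "ListRecords",
              metadataPrefix="oai_dc")
print(url)
```

Swapping in the id of a newly added format as the metadataPrefix is a simple way to check that the new crosswalk has been picked up.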
  89. 89. The REST API • Read-only access for the Anonymous user • Simply provides JSON responses statelessly, and is not configurable short of changes to how Tomcat serves it • Changes or additions to the JSON responses would require some Java coding
  90. 90. Final Thoughts on Configuration • Everything requires a restart, which can be a pain in production environments requiring scheduling and after-hours work. • Wondering about whether and how something can be done? The DSpace Wiki is your friend: https://wiki.duraspace.org/display/DSDOC4x/DSpace+4.x+Documentation • If you lose functionality after an upgrade, one of the configuration files is probably to blame. An upgrade requires a comparison of the new stock configurations with your old ones (line by line!). Specific issues can be addressed by checking the docs regarding the feature in question. • [dspace-install-dir]/log/dspace.log.[today’s date] is a great place to start looking if you’re stumped about your repository not acting as expected.
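For convenience, the name of today's log file can be computed rather than typed. The Python sketch below assumes the date suffix takes the form YYYY-MM-DD, which should be checked against the files actually present in your log directory; the install directory /dspace is a placeholder.

```python
from datetime import date
from pathlib import PurePosixPath


def todays_log(install_dir, today=None):
    """Return the expected path of the DSpace log for the given (or current) day."""
    today = today or date.today()
    # Assumes a YYYY-MM-DD suffix; adjust if your logging configuration differs.
    return PurePosixPath(install_dir) / "log" / f"dspace.log.{today.isoformat()}"


print(todays_log("/dspace"))
```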
