Queen's University QSpace Pilot Project

Information Technology Services
Dupuis Hall, Kingston, ON K7L 3N6

Queen’s University QSpace Pilot Project Technical Report

Subject: QSpace Project Technical Report
Number: Version 1.00
Issued by: ITServices, Gail Ferland, Project Tech Lead
Date: 11 Sep 05
Maximum Review Period: 1 year
History

Revision Chart

Version   Author(s)   Description of Revision   Date Completed
1.00      SR, GF      Initial Draft             11 Sep 05

QSpace Pilot Project Final Technical Report.doc 1
Table of Contents

1.0 Foreword
2.0 QSpace Life Cycle
  2.1 Pilot Project
    2.1.1 Installation
    2.1.2 Customization
  2.2 Production Implementation
    2.2.1 General Hardware Planning Advice
    2.2.2 Proposed Hardware Solution
    2.2.3 Summary of Expected Hardware/Licensing Expenses
    2.2.4 Production System Implementation Planning
  2.3 QSpace Operations & Maintenance
    2.3.1 QSpace Administration
    2.3.2 QSpace Application Administration
    2.3.3 QSpace Application Updates
    2.3.4 QSpace Programming
    2.3.5 QSpace User Interface Changes
    2.3.6 Database Administration
    2.3.7 QSpace Support
    2.3.8 QSpace Training
    2.3.9 QSpace Tasks - Annual Work Estimates
    2.3.10 Production Support and Maintenance Recommendation
3.0 Summary of Recommendations
Appendix A – Hardware Comparison Table
Appendix B – Hardware Selection Guiding Principles
1.0 Foreword

QSpace is a Queen’s branded instance of DSpace, an open source Institutional Repository (IR). DSpace is implemented as a web application that runs within a Java application container. The Queen’s instance of DSpace (QSpace) is hosted in a Jakarta Tomcat Servlet/JSP application container.

This document is meant to guide the evolution of the QSpace pilot project into a full production system. It provides recommendations on hardware, an implementation plan, and ongoing support and maintenance.
2.0 QSpace Life Cycle

2.1 Pilot Project

The QSpace pilot project implemented a DSpace repository, enabling the QSpace Project Steering Group to develop a business plan for a sustainable repository.

The 16 Communities represented in the pilot held 21 Collections accounting for 94 submissions. Submissions ranged from individual PDF documents, to entire web sites, to bundles of learning objects, for a total of 3340 electronic files. 98 EPeople were created in QSpace, representing the individuals who registered to submit content or to receive updates about content submissions.

Disk space usage during the pilot totalled 962.2 MB, broken down as follows:

• Asset store (electronic submissions): 212.2 MB
• Database (metadata): 100.0 MB
• Full text search indices: 8.5 MB
• History directory: 18.1 MB
• DSpace logs directory: 197.4 MB
• Tomcat logs directory: 200.0 MB
• Postgres db logs directory: 226.0 MB

2.1.1 Installation

Stage completed. Refer to document: D1_QSpace_Install.doc

2.1.2 Customization

Stage completed. Refer to document: D2_QSpace_Customization.doc

2.2 Production Implementation

2.2.1 General Hardware Planning Advice

The following advice is from a posting to the DSpace-tech list by Robert Tansley, Digital Media Systems Programme, HP Labs. Robert Tansley presently serves as the DSpace software architect. The posting can be found here: http://sourceforge.net/mailarchive/message.php?msg_id=11094066

“…the dual processors and 1 or 2 GB RAM will probably be well enough for your pilot phase, and have enough "oomph" to give you some time and headroom when it comes to buying a beefier server.

…go for a rack-mounted version if you have racks you can use. This will give you a similar amount of power but will be more expandable storage-wise. So it depends how much storage you envisage needing: If it’s in the region of 1.4 Tb or less, a workstation will do, otherwise go for an entry-level server with internal storage to start with, which doesn’t cost much more and you can expand the storage later. In any case memory is always worth it, I’d say go for 2GB.”

Lessons to take away from this advice:

• 2 GB RAM
• Rack mount server for expandability and flexibility
• 2 processors

Refer to Appendix B – Hardware Selection Guiding Principles.

2.2.2 Proposed Hardware Solution

The proposed hardware solutions conform to the advice given by Robert Tansley. They also compare well to the benchmark hardware used by Glasgow University and the University of Toronto. Finally, the “Hardware Selection Guiding Principles” have been considered as far as is possible within the constraints of cost.
2.2.2.1 Hardware Options

Two options are proposed. Both are rack mountable servers (vs. desktops), and both are Sun products, in keeping with ITS hardware competencies.

Table 1 – Hardware proposals for QSpace

Description        Processors           RAM    Internal Disk   External Disk   Price
1. Sun Fire V210   2x 64 bit 1.34 GHz   2GB    2x73GB          n/a             $6,995
2. Sun Fire V240   2x 64 bit 1.5 GHz    2GB    2x146GB         n/a             $11,200

Option 1 - Least cost

This is the least cost option worth considering. It delivers all that is needed in the short term, but is not particularly scalable.

• It can accommodate up to 2 internal disks, either 73 or 146GB. It is priced with 2 x 73GB disks to keep the cost of this option low.
• It has one expansion slot. This slot could be used to install an optional fibre channel card, for access to the ITS Storage Area Network (SAN).

Option 2 - Scalable

This option has more flexibility, and as a result is more future-proof. It was designed for greater availability.

• It has redundant power supplies.
• It can accommodate up to 4 internal disks, either 73 or 146GB. It is priced here with 2 x 146GB disks. However, two additional 146GB drives can be added for $2400. Four disks allow for internal storage of up to 584GB.
• It has three expansion slots. Two fibre channel cards would allow redundancy in the data path to the ITS SAN, leaving one slot available for other needs.

Notes

• Both options allow up to 8GB of RAM.
• Both options have four Ethernet ports.
• If internal disk storage is insufficient now or in the future, both options offer an expansion slot that could accommodate a fibre channel card used to connect to the ITS Storage Area Network (SAN).
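To put the internal storage figures in context, the following is a minimal Python sketch that recomputes the pilot's disk usage total from section 2.1 and the maximum raw internal capacity of each option. The disk counts and sizes come from the text above; the names `PILOT_USAGE_MB`, `MAX_INTERNAL_DISKS`, and the helper functions are illustrative only, and the capacity figure assumes no space is given up to mirroring or file system overhead.

```python
# Pilot disk usage in MB, as listed in section 2.1 of this report.
PILOT_USAGE_MB = {
    "Asset store": 212.2,
    "Database (metadata)": 100.0,
    "Full text search indices": 8.5,
    "History directory": 18.1,
    "DSpace logs directory": 197.4,
    "Tomcat logs directory": 200.0,
    "Postgres db logs directory": 226.0,
}

# Maximum number of internal disks per option (section 2.2.2.1).
MAX_INTERNAL_DISKS = {"Sun Fire V210": 2, "Sun Fire V240": 4}
LARGEST_DISK_GB = 146  # largest internal disk size offered for either option


def pilot_total_mb():
    """Sum the pilot usage breakdown, rounded to one decimal place."""
    return round(sum(PILOT_USAGE_MB.values()), 1)


def max_internal_storage_gb(model):
    """Raw capacity with every bay filled (assumes no mirroring overhead)."""
    return MAX_INTERNAL_DISKS[model] * LARGEST_DISK_GB


if __name__ == "__main__":
    print(f"Pilot total: {pilot_total_mb()} MB")
    for model in MAX_INTERNAL_DISKS:
        print(f"{model}: up to {max_internal_storage_gb(model)} GB internal")
```

Under these assumptions the pilot footprint of under 1 GB is a very small fraction of either option's internal capacity, which is consistent with the report's expectation that internal storage will suffice in the near term.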
2.2.2.2 Server Administration Program

ITS provides a Dedicated Server Administration Plan (DSAP), under which:

• ITS manages user accounts and file security
• ITS takes care of operating system patches and virus detection software on the server
• Backups are managed by ITS
• Standardized hardware allows for some extra redundancy
• Servers are located in a climate controlled, secure area

With the Dedicated Plan, your Department purchases a server, operating system software and client licenses as recommended by ITS. ITS installs and administers the server in our facilities. The service cost is $1200.00 per year.

2.2.2.3 Hardware Service Agreement

Whether Option 1 or 2 is chosen, a hardware service agreement should be considered. Sun offers various levels of support and varying contract lengths. ITS contracts for Silver level support on most of its servers.

• Sun Fire V210 Warranty Upgrade to Silver Support - 3 Years: CAN$ 2,808.00
• Sun Fire V240 Warranty Upgrade to Silver Support - 3 Years: CAN$ 3,096.00

2.2.2.4 Backup service

ITS backup service is $25/GB per year.

2.2.2.5 Future Proofing

If the internal disk storage proves to be insufficient, the ITS Storage Area Network (SAN) can be used. This service costs $50/GB per year, which includes backup.

Additional hardware is required before accessing the SAN:

• One or more fibre channel cards. Option 1 will only accommodate one card. Option 2 will accommodate two cards. A fibre channel card costs $1610.
• Fibre, at a cost in the range of $200.
• Access to a fibre switch, which is a shared expense. On a 16 port switch, Option 1 would consume 1/16 of the ports and therefore 1/16 of the cost of the switch. For Option 2 this fraction increases to 2/16. Hence, as a fraction of the cost of a $10,000 switch, Option 1 would equate to $625, and Option 2 would equate to $1250.

2.2.2.6 Test and Development Server

The QSpace pilot is presently hosted on carrrick.its.queenus.ca. This is a modest Sun V100 with 1 GB RAM and 40 GB of disk. It is not suited to the expected production environment, but it would serve as an excellent Test and Development server. This server and its service agreement are paid for. There are no further costs associated with maintaining this test/development server.

2.2.3 Summary of Expected Hardware/Licensing Expenses

This summary is based on the ITServices preferred hardware configuration (Option 2). It does not consider the use of the SAN. The needs of QSpace in the next three years are likely to be met with internal storage.

Table 2 – Summary of Expected Expenses

Description                             Year 1     Year 2    Year 3
Sun Fire V240                           $11,200*   --        --
Sun Maintenance Silver Plan             $3,096*    --        --
Dedicated Server Administration Plan    $1,200     $1,200    $1,200
Backup (estimate based on 10 GB)        $250       $250      $250
Verisign SSL Certificate                $100       $100      $100
Total                                   $15,846    $1,550    $1,550

* Add applicable taxes.
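The totals in Table 2 and the shared switch cost in section 2.2.2.5 can be re-derived with a short Python sketch. All dollar figures are taken from this report; the names `COSTS`, `yearly_totals`, and `switch_share` are illustrative, and taxes are left out, as in the table.

```python
YEARS = 3

# Cost per line item for Year 1, Year 2, Year 3 (before taxes), from Table 2.
COSTS = {
    "Sun Fire V240": (11200, 0, 0),
    "Sun Maintenance Silver Plan": (3096, 0, 0),
    "Dedicated Server Administration Plan": (1200, 1200, 1200),
    "Backup (10 GB at $25/GB per year)": (250, 250, 250),
    "Verisign SSL Certificate": (100, 100, 100),
}


def yearly_totals(costs):
    """Return the column totals for each of the three years."""
    return [sum(row[year] for row in costs.values()) for year in range(YEARS)]


def switch_share(switch_cost, ports_used, ports_total=16):
    """Share of a fibre switch's cost, in proportion to the ports consumed."""
    return switch_cost * ports_used / ports_total


if __name__ == "__main__":
    print("Yearly totals:", yearly_totals(COSTS))
    print("Option 1 switch share:", switch_share(10_000, 1))
    print("Option 2 switch share:", switch_share(10_000, 2))
```

Keeping the line items in one table-like structure makes it straightforward to re-run the totals if, for example, the backup estimate grows beyond 10 GB.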
2.2.4 Production System Implementation Planning

Table 3 – QSpace Production System Plan

Assignment/Task                         Dependency/Limits                   Resources                Time
Acquire and install hardware            • Physical space                    CCStore, ITS Sys & Ops   7 days
(refer to section 2.2.2, Proposed       • Networking
Hardware Solution)                      • Ops scheduling for installation
                                        • Delivery – usual delivery is
                                          4 weeks after order
Install DSpace                                                              Sys Dev                  15 days
Capture Pilot code and customization                                        Sys Dev                  5 days
changes (CVS)
Reapply existing customization                                              Sys Dev                  10 days
Migrate existing data                                                       Sys Dev                  5 days
Test backup and recovery process                                            Sys Dev, ITS Sys & Ops   5 days
Implement and document appropriate                                          ITS Sys & Ops            2 days
NOCOL system procedures for QSpace
Complete Footprint issues on hold                                           ITS Sys & Ops, Library   3 days
– i.e. bulk upload
Develop central authentication method                                       Sys Dev                  12 days
for QSpace
Training of Library Staff                                                   Sys Dev                  1 day

2.3 QSpace Operations & Maintenance

2.3.1 QSpace Administration

2.3.1.1 Community (Top Level) Administration

• Create top level Communities.
• Customize top level Community splash pages.
• Set top level user permissions, including delegation to Collection Administrators.
• Choose Community licence.
• Set up Community workflow.

2.3.1.2 Collection (Delegated) Administration

• Create Collections within delegated Community.
• Customize Collection splash pages.
• Optionally override Community licence with Collection specific licence.
2.3.1.3 Workflow Administration (optional)

• Accept or reject initial submissions.
• Accept or reject metadata.

2.3.2 QSpace Application Administration

• Application tuning.
• Monitor application health.
• Shutdown/start-up as needed.
• Troubleshoot errors and warnings.
• Security settings.
• Certificate.
• Logging.
• System scripting.

2.3.3 QSpace Application Updates

• Major application updates, once or twice annually.
• Migrate programming customizations.
• Migrate User Interface “Look and Feel” customizations (using CVS).
• Migrate “Content” such as QSpace help files, QSpace About pages, and custom QSpace jsps.

2.3.4 QSpace Programming

2.3.4.1 Bulk Uploading Collections

• XML encapsulation of metadata.
• Perl and shell programming to collect and format XML records.

2.3.4.2 Custom Features

• Productivity features as deemed necessary.

2.3.5 QSpace User Interface Changes

• Major “Look and Feel” rewrite every 3 years.
• As required, “Content” changes to existing QSpace help files, QSpace About pages, and custom QSpace jsps.

2.3.6 Database Administration

• Tune database.
• Troubleshoot database errors and warnings.
• Troubleshoot and correct user input errors.
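As an illustration of the “XML encapsulation of metadata” step in section 2.3.4.1, the sketch below builds a metadata record in the `dublin_core.xml` layout used by DSpace's simple archive format for batch import. This is an assumption for illustration: the report does not state which record format the QSpace scripts produce, and it names Perl and shell rather than Python. The function name `build_dublin_core` and the sample field values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(fields):
    """Build a dublin_core.xml document from (element, qualifier, value) tuples.

    Assumes the <dublin_core>/<dcvalue> layout of the DSpace simple archive
    format; a missing qualifier is written as "none".
    """
    root = ET.Element("dublin_core")
    for element, qualifier, value in fields:
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        dcvalue.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    record = [
        ("title", None, "Sample Learning Object"),
        ("contributor", "author", "Doe, Jane"),
        ("date", "issued", "2005"),
    ]
    print(build_dublin_core(record))
```

In a bulk upload, one such file would typically be written per item alongside the item's files, so a collecting script only needs to map each source record to a list of tuples.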
2.3.7 QSpace Support

2.3.7.1 First Line Support

• End user feature support.
• Delegated administrator support.

2.3.7.2 Second Line Support

• QSpace Administration support.
• Workflow Administration support.

2.3.7.3 Third Line Support

• Advanced and undocumented features, including super user.
• Application defects (bugs).
• Database manipulation to recover from user input errors (and bugs).

2.3.8 QSpace Training

2.3.8.1 QSpace Administrator (and backup)

• General Institutional Repository (IR) background and philosophy.
• DSpace data model.
• DSpace permission model.
• Delegation to Collection administrators.
• Workflow model.
• DSpace admin interface.
• Pointers to available DSpace resources, including lists, user groups, etc.

2.3.8.2 Delegated Administrators

• General Institutional Repository (IR) background and philosophy.
• DSpace data model.
• DSpace permission model.
• DSpace admin interface.
• Pointers to available DSpace resources, including lists, user groups, etc.

2.3.8.3 Library Staff

• QSpace purpose, goals, and access.
• Guide and direct end users to QSpace.

2.3.8.4 End users

• QSpace purpose, goals, and access.
2.3.9 QSpace Tasks - Annual Work Estimates

Table 4 – QSpace Tasks – Annual Work Estimates

Task      Description                             Responsibility   Min Weeks   Max Weeks
2.3.1     QSpace Administration
2.3.1.1   Community (Top Level) Administration    Library          2           6
2.3.1.2   Collection (Delegated) Administration   Library          2           6
2.3.1.3   Workflow Administration (optional)      Library          (4)         (16)
2.3.2     QSpace Application Administration       Sys Dev          2           4
2.3.3     QSpace Application Updates              Sys Dev          2           4
2.3.4     QSpace Programming
2.3.4.1   Bulk Uploading Collections              Sys Dev          2           6
2.3.4.2   Custom Features                         Sys Dev          0           unlimited
2.3.5     QSpace User Interface Changes           Sys Dev          2           8*
2.3.6     Database Administration                 Sys Dev          2           4
2.3.7     QSpace Support
2.3.7.1   First Line Support                      Library          4           8
2.3.7.2   Second Line Support                     Library          2           4
2.3.7.3   Third Line Support                      Sys Dev          2           3
2.3.8     QSpace Training
2.3.8.1   QSpace Administrator (and backup)       Sys Dev          2           3
2.3.8.2   Delegated Administrators                Library          3           4
2.3.8.3   Library Staff                           Library          1           2
2.3.8.4   End users                               Library          1**         2**

* Major User Interface rewrite.
** Broadcast means of training: print materials, FAQ, online documents.

2.3.10 Production Support and Maintenance Recommendation

The table above summarizes the tasks associated with QSpace support and maintenance. It also identifies the time commitment of each task and where the responsibility for that task might lie.

The tasks and time commitments allocated to the Sys Dev position could be supported by a 0.5 FTE position. We recommend that a salary grade 8 position would be required to complete the tasks within the estimated timelines. The cost estimate for a 0.5 FTE at salary grade 8 is $30,000 per year.
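The 0.5 FTE recommendation can be checked against Table 4 with a small Python sketch that totals the minimum and maximum weeks per responsibility. The week figures are taken from the table; the optional Workflow Administration row is left out, the open-ended “unlimited” maximum for Custom Features is excluded from the maximum, and a 52 week working year is an assumption made here, not a figure from the report.

```python
WEEKS_PER_YEAR = 52  # assumption: FTE measured against a 52 week year

# (task, responsibility, min_weeks, max_weeks); None marks an unlimited maximum.
TASKS = [
    ("Community (Top Level) Administration", "Library", 2, 6),
    ("Collection (Delegated) Administration", "Library", 2, 6),
    ("QSpace Application Administration", "Sys Dev", 2, 4),
    ("QSpace Application Updates", "Sys Dev", 2, 4),
    ("Bulk Uploading Collections", "Sys Dev", 2, 6),
    ("Custom Features", "Sys Dev", 0, None),
    ("QSpace User Interface Changes", "Sys Dev", 2, 8),
    ("Database Administration", "Sys Dev", 2, 4),
    ("First Line Support", "Library", 4, 8),
    ("Second Line Support", "Library", 2, 4),
    ("Third Line Support", "Sys Dev", 2, 3),
    ("Training: QSpace Administrator (and backup)", "Sys Dev", 2, 3),
    ("Training: Delegated Administrators", "Library", 3, 4),
    ("Training: Library Staff", "Library", 1, 2),
    ("Training: End users", "Library", 1, 2),
]


def weeks_for(owner):
    """Return (min, max) annual weeks for one owner, skipping unlimited maxima."""
    rows = [task for task in TASKS if task[1] == owner]
    low = sum(task[2] for task in rows)
    high = sum(task[3] for task in rows if task[3] is not None)
    return low, high


if __name__ == "__main__":
    for owner in ("Sys Dev", "Library"):
        low, high = weeks_for(owner)
        print(f"{owner}: {low} to {high} weeks per year")
    print(f"0.5 FTE is about {0.5 * WEEKS_PER_YEAR:.0f} weeks per year")
```

Under these assumptions, half of a 52 week year (26 weeks) falls between the Sys Dev minimum and bounded maximum, with any Custom Features work coming on top of that.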
3.0 Summary of Recommendations

In short, there are three recommendations: recommended hardware, a recommended implementation plan, and a division of labour between Library Staff and a System Developer. Each is outlined below:

1. Sun Fire V240, 2x 1.5 GHz processors, 2GB RAM, 2x146GB internal storage, Sun 3 year Silver level support upgrade, ITS Dedicated Server Administration Plan, and ITS backup service.
2. Implementation plan to follow the tasks described in Table 3 – QSpace Production System Plan.
3. A 0.5 FTE System Developer at Salary Grade 8 is required to complete the implementation of QSpace and to provide ongoing technical maintenance and support.
Appendix A – Hardware Comparison Table

This table has been provided for easy comparison of all table data presented in the hardware section of this document. Keep in mind that DSpace architect Robert Tansley suggests a system should as a minimum comprise:

• 2 GB RAM
• Rack mount server for expandability and flexibility
• 2 processors

All of the following solutions are rack mountable.

Table A1 – Hardware Comparison

Description        Processors           RAM    Internal Disk   External Disk   Price

DSpace FAQ Recommendations
1.a. HP – Entry    2x 64 bit 900MHz     2GB    36GB            2TB             $40,000
1.b. HP – Mid      2x 64 bit 900MHz     2GB    36GB            4TB             $485,000
1.c. HP – High     2x 64 bit 900MHz     2GB    36GB            50TB            $1,800,000
2. Sun Fire 280R   2x 64 bit 900MHz     2GB    2x36GB          436GB           $30,000
3. Dell 2650       2x 32bit 2.4GHz      2GB    2x73GB          2.5 TB          $10,000

University DSpace Benchmarks (Glasgow/Toronto)
Sun Fire           2x 64 bit 900MHz     4GB    36GB            432GB           ?
IBM P670           2x 64 bit 1.2 GHz    1 GB   ?               100GB           ?

Queen’s Recommendations
1. Sun Fire V210   2x 64 bit 1.34 GHz   2GB    2x73GB          n/a             $6,995
2. Sun Fire V240   2x 64 bit 1.5 GHz    2GB    2x146GB         n/a             $11,200
Appendix B – Hardware Selection Guiding Principles

1.0 High Availability

A measure of how much time a network or a connection is running, i.e. time running divided by time measured. Thus, if you measured something for 20 minutes and it was only up for 19 of them, you'd have 95% availability.

High Availability is impacted by:

• Reliability – the ability of an item to perform a required function under given conditions for a given time interval.
• Resiliency – intelligence in network devices that ensures fast recovery around any device or link failure.
• Redundancy – device and link redundancy, such as mirrored drives, dual network interfaces, dual power supplies, etc.
• Robustness – the ability to continue to operate, especially when under stress or when confronted with invalid input.
• Performance – given that it works, how well does it work. Performance includes throughput and response time for both bursty and steady state traffic.
• Scalability – the ability to augment components of an existing system to deal with increasing work load, rather than replacing the entire system.
• Load balancing – allows a device to take advantage of multiple best paths to a given destination.
• Hot swappable devices – allow replacement or addition of components such as disks without having to shut down.

Network Design

Well defined network topologies and configurations designed to ensure there is no single point of failure.

Data Security

Integrity, confidentiality, and availability (i.e. Denial of Service).

Data Integrity

Ensure that data is never lost.

Maintainability

Ease of repair and cost of repair.

Cost
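The availability definition in section 1.0 above (time running divided by time measured) can be expressed as a short Python sketch. The 19 of 20 minutes example is from the text; the function names and the yearly downtime helper, which assumes a 365 day year, are illustrative additions.

```python
def availability(time_running, time_measured):
    """Availability as a fraction: time running divided by time measured."""
    if time_measured <= 0:
        raise ValueError("time_measured must be positive")
    if not 0 <= time_running <= time_measured:
        raise ValueError("time_running must lie between 0 and time_measured")
    return time_running / time_measured


def allowed_downtime_hours(target, hours_in_period=24 * 365):
    """Hours of downtime permitted by an availability target over one period.

    Assumes a 365 day year by default; this helper is not part of the report.
    """
    return (1 - target) * hours_in_period


if __name__ == "__main__":
    # Example from the text: up for 19 of 20 measured minutes.
    print(f"Availability: {availability(19, 20):.0%}")
    print(f"Downtime allowed at 99%: {allowed_downtime_hours(0.99):.1f} hours/year")
```

Turning a target percentage into permitted hours of downtime can make it easier to weigh the redundancy features of the two hardware options against their cost.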
2.0 Hardware Suggestions from DSpace FAQ

The DSpace FAQ can be found here: http://www.dspace.org/faqs/

The following excerpt from the DSpace FAQ is dated, and should be used with caution. The hardware specs are for an older generation of hardware that is up to two years old. For the quoted price, expect to get faster processors and more disk space in newer generations of hardware. Also keep in mind that Hewlett Packard (HP) has heavily sponsored the DSpace project.

“There are no specific server requirements for DSpace except UNIX. Because the application is written in Java, in theory it will run on other platforms as well. The system runs on anything from a laptop to a $500K server, but there are a few general recommendations for hardware architectures. For a research university, DSpace requires a reasonably good server (see below) and a decent amount of memory and disk storage.”

Table 5 – DSpace FAQ Hardware recommendations

Description        Processors           RAM    Internal Disk   External Disk   Price
1.a. HP - Entry    2x 64 bit 900MHz     2GB    36GB            2TB             $40,000
1.b. HP - Mid      2x 64 bit 900MHz     2GB    36GB            4TB             $485,000
1.c. HP - High     2x 64 bit 900MHz     2GB    36GB            50TB            $1,800,000
2. SunFire 280R    2x 64 bit 900MHz     2GB    2x36GB          436GB           $30,000
3. Dell 2650       2x 32bit 2.4GHz      2GB    2x73GB          2.5 TB          $10,000

1. HP Server rx2600, powered by dual 64-bit Intel Itanium 2 processors (900MHz), 2GB RAM, 26 GB internal disk storage. HP StorageWorks Modular SAN Array 1000 (msa1000) with a single high-performance controller. Options include a second controller and, with the addition of two more drive enclosures, controls up to 42 Ultra2, Ultra 3, or Ultra320 SCSI drives. Total capacity can be six terabytes. Cost starts around $40K and goes up to around $1.8M.

2. SunFire 280R Server, two 900MHz UltraSPARC-III Cu processors, 8MB E-cache, 2GB memory, two 36GB 10,000rpm HH internal FCAL disk drives, DVD, 436-GB, or 12 x 26.4 Gbyte 10K RPM disks, Sun StorEdge A1000 rack mountable w/ 1 HW RAID controller, 24MB std cache. Around $30K.
3. Dell PowerEdge 2650 with dual Xeon processors (2.4GHz), 2GB RAM, 2x73GB SCSI disks. One 2.5TB Apple XServe. A DLT tape library to back up the DB/jsps etc. Around $10K.

3.0 Benchmark Hardware Installations

University of Glasgow, 19,500 students, 5,000 staff.

Table 6 – University of Glasgow hardware

Description        Processors           RAM    Internal Disk   External Disk   Price
Sun Fire           2x 64 bit 900MHz     4GB    36GB            432GB           ?

University of Toronto, 53,000 students, 9,600 staff.

Table 7 – University of Toronto hardware

Description        Processors           RAM    Internal Disk   External Disk   Price
IBM P670           2x 64 bit 1.2 GHz    1 GB   ?               100GB           ?