2. Where DSpace stores data
The /dspace/assetstore directory holds all the
− Bitstreams and licenses
The PostgreSQL database contains information on
− Metadata
− Information about Communities
− Information about Collections
− Information about e-groups & authorizations
− Information about e-persons & authorizations
− A host of other information
3. Export/Import in DSpace
Export and import deal only with bitstreams,
metadata, licenses and handles,
but NOT with information about communities,
collections, members, reviewers etc., or access
permissions/restrictions
You can export or import
− An item or
− All items in a collection
4. Export command syntax
/dspace/bin/dsrun org.dspace.app.itemexport.ItemExport
--type=COLLECTION --id=collID
--dest=dest_dir --number=seq_num
Where
--type can have either the value COLLECTION or ITEM
--id is the handle (prefix/id) of the collection or item, e.g. 1849/2
(or 123456789/2 if you have not registered a handle prefix)
--dest is the destination directory
(the directory must be created before running the script)
--number is the sequence number; it can be just 1
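As a concrete instance of the syntax above, the following sketch assembles the export command line for one item or one collection and prints it instead of running it. The function name build_export_cmd, the DSRUN variable and the example handle 123456789/42 are illustrative assumptions; only the class name and the four options come from the slide.

```shell
# Assumption: dsrun lives where the slide says; set DSRUN to override.
DSRUN="${DSRUN:-/dspace/bin/dsrun}"

# build_export_cmd TYPE ID DEST [NUMBER]
# Prints the ItemExport command line; it does not run it.
build_export_cmd() {
    case "$1" in
        ITEM|COLLECTION) ;;
        *) echo "type must be ITEM or COLLECTION" >&2; return 1 ;;
    esac
    # NUMBER defaults to 1 when it is not given.
    echo "$DSRUN org.dspace.app.itemexport.ItemExport" \
         "--type=$1 --id=$2 --dest=$3 --number=${4:-1}"
}
```

For example, build_export_cmd ITEM 123456789/42 /tmp/export prints the command for a single item; run the printed command once the destination directory exists.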
5. Shell Script for exporting
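#!/bin/bash
# Export each listed collection into its own subdirectory of $1
# (bash is required: the script uses arrays and the C-style for loop)
if test $# -ne 1
then
  echo "Usage: $0 <export-directoryname>"
  exit 1
fi
declare -a collection_id=(2 3 4 5 6 7)
for ((i=0; i<${#collection_id[@]}; i++))
do
  mkdir -p "$1/${collection_id[$i]}"
  /dspace/bin/dsrun \
    org.dspace.app.itemexport.ItemExport \
    --type=COLLECTION \
    --id=1849/${collection_id[$i]} \
    --dest="$1/${collection_id[$i]}" \
    --number=1
done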
6. In the shell script...
Look for the line that declares collection_id
(it begins with "declare")
Replace 2 3 4 etc. with your collection ids
Clue: a collection id is the number that appears in the
browser URL after the handle prefix, i.e. if you have not
registered with CNRI, the number that appears after
123456789/
Also create the directory where the data should be
exported to
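The clue above can be turned into a small helper: given a handle or a browser URL that ends in a handle, print the part after the last slash, which is the collection id. The function name handle_suffix and the example URL are hypothetical and shown only for illustration.

```shell
# handle_suffix URL_OR_HANDLE
# Prints whatever follows the last "/" (the collection id).
handle_suffix() {
    h="${1%/}"        # drop one trailing slash, if present
    echo "${h##*/}"   # keep the text after the last remaining slash
}
```

With a URL such as http://localhost:8080/dspace/handle/123456789/2 (an assumed address), handle_suffix prints 2, the value to put in the collection_id list.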
7. Shell Script for Import
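#!/bin/bash
# Import each exported collection directory found under $1
if test $# -ne 1
then
  echo "Usage: $0 <export-directoryname>"
  exit 1
fi
declare -a collection_id=(2 3 4 5 6 7)
for ((i=0; i<${#collection_id[@]}; i++))
do
  # one map file per collection, so one run does not clobber another's map
  /dspace/bin/dsrun \
    org.dspace.app.itemimport.ItemImport \
    -a -e dspace@localhost.localdomain \
    -c 123456789/${collection_id[$i]} \
    -s "$1/${collection_id[$i]}" \
    -m mapfile_${collection_id[$i]}
done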
8. In the import script...
Here also change the collection ids in the import
program
The -e option should have the DSpace admin id (i.e. the
e-mail address)
9. What is exported
The following files will be created for every item
− dublin_core.xml (metadata)
− handle (one line containing the handle number)
− license.txt
− Actual file (bitstream: could be a pdf, doc or
image file)
− contents (with two lines: the license file name and
the actual bitstream name)
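A quick sanity check of an exported item directory follows from the list above. This is a minimal sketch, assuming the listing file is named contents in lowercase and that each of its lines starts with a file name (anything after a tab is ignored); the function name check_item_dir is made up for this example.

```shell
# check_item_dir DIR
# Returns 0 if DIR holds dublin_core.xml, contents, and every file that
# contents names; returns 1 and reports on stderr otherwise.
check_item_dir() {
    dir="$1"
    status=0
    tab=$(printf '\t')
    for f in dublin_core.xml contents; do
        [ -f "$dir/$f" ] || { echo "missing: $dir/$f" >&2; status=1; }
    done
    if [ -f "$dir/contents" ]; then
        while IFS= read -r line || [ -n "$line" ]; do
            name="${line%%"$tab"*}"     # file name is the first tab-separated field
            [ -z "$name" ] && continue  # skip blank lines
            [ -f "$dir/$name" ] || { echo "missing: $dir/$name" >&2; status=1; }
        done < "$dir/contents"
    fi
    return $status
}
```

Running it over every numbered subdirectory of an export gives an early warning if a bitstream did not make it to disk.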
10. However
Import and export are meant for data exchange
They can, however, be used for partial backup
They take care of items only
They do not back up
− Your communities, collections, e-groups, e-persons
11. How to back up PostgreSQL
Run pg_dump as the dspace user
Example:
$ pg_dump dspace > backupfile
Note: dspace is the name of the database
The backup file will have all the table definitions and
contents.
pg_dump has lots of options
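If dumps are taken regularly, a dated file name keeps one dump from overwriting the previous one. The sketch below only prints the pg_dump command; the function name dump_cmd, the /backup directory and the file naming scheme are assumptions made for illustration.

```shell
# dump_cmd DBNAME OUTDIR [DATE]
# Prints a pg_dump command that writes to OUTDIR/DBNAME-DATE.sql.
# DATE defaults to today's date as YYYYMMDD.
dump_cmd() {
    stamp="${3:-$(date +%Y%m%d)}"
    echo "pg_dump $1 > $2/$1-$stamp.sql"
}
```

For example, dump_cmd dspace /backup 20240131 prints: pg_dump dspace > /backup/dspace-20240131.sql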
12. How to restore the database
psql -d dspace -f dumpedfile
Note: psql has lots of options; to know more
about them, see the psql documentation
13. Alternative (using tar)
To dump a database called mydb that contains
large objects to a tar file:
$ pg_dump -Ft -b mydb > db.tar
To reload this database (with large objects)
to an existing database called newdb:
$ pg_restore -d newdb db.tar
14. Upgrading
This procedure should be the first step when you are
upgrading DSpace to a newer version
Even if the upgrade fails, you have a backup to fall
back on
15. Upgrading Tip
Use a different database and a different user, so
that you do not have to touch the existing DSpace
installation
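One way to follow this tip is to load a tar-format dump into a fresh database owned by a fresh user. The sketch below prints the three PostgreSQL commands rather than running them; the function name upgrade_copy_cmds and the names dspace_new and db.tar are placeholders, and the createuser and createdb steps are assumed to be run by a PostgreSQL superuser.

```shell
# upgrade_copy_cmds NEW_USER NEW_DB DUMP_FILE
# Prints the commands that create a separate user and database and load
# a tar-format dump (made with pg_dump -Ft) into it.
upgrade_copy_cmds() {
    echo "createuser --pwprompt $1"
    echo "createdb --owner=$1 --encoding=UTF8 $2"
    # --no-owner makes the restored objects belong to the connecting user
    echo "pg_restore --no-owner --username=$1 --dbname=$2 $3"
}
```

The existing dspace database and user are never touched, so the running installation stays usable while the new version is tried out against the copy.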
16. Extra care
It is a good idea to take a tape (or hard disk) backup
of
− The entire /dspace directory
− The pg_dump output file
− The export directory
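The three pieces listed above can be bundled into one compressed archive before copying it to tape or another disk. This is a minimal sketch using tar; the function name archive_all and the paths in the usage note are assumptions, not part of the original procedure.

```shell
# archive_all OUT_TARBALL PATH...
# Packs every PATH into OUT_TARBALL (gzip-compressed tar).
# Refuses to run if any PATH is missing, so a partial backup is not
# mistaken for a complete one.
archive_all() {
    out="$1"; shift
    for p in "$@"; do
        [ -e "$p" ] || { echo "not found: $p" >&2; return 1; }
    done
    tar -czf "$out" "$@"
}
```

A call might look like: archive_all /mnt/backup/dspace-full.tar.gz /dspace /backup/backupfile /backup/export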
17. Final Lesson
Learning DSpace is easy.
− It can be learnt in a week
− It can be mastered in a month
Creating content is continuous and long-term,
perhaps without end
Be more careful with the content