Submitted on: February 6, 2013

Virtual Libraries in Research Funding Agencies: an Open Source approach to disseminate
Information to Faculty, Research Teams and Civil Society

Diego Ferreira Ucha
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: diego@fapesp.br

Guilherme Giacchetto Moreira
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: giacchetto@fapesp.br

Rosaly Favero Krzyzanowski
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: rosalyfk@fapesp.br

Inês Maria de Morais Imperatriz
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: immi@fapesp.br

Copyright © 2013 by Diego F. Ucha, Guilherme G. Moreira, Rosaly F. Krzyzanowski,
Inês M. Imperatriz. This work is made available under the terms of the Creative Commons
Attribution 3.0 Unported License: http://creativecommons.org/licenses/by/3.0/

Abstract:
Virtual Libraries lack Open Source solutions when compared to the Digital Libraries context.
The latter has a well-honed community to support information systems such as DSpace,
Greenstone and Fedora. As a solution for Virtual Libraries in Funding Agencies, the São Paulo
Research Foundation (FAPESP) has developed a Virtual Library to store and index the metadata
for its funded Research Projects, Scholarships and Scientific Publications. This Virtual
Library helps address the needs of Civil Society, which can access referential information on
the results achieved in funded research (taxes); of Academia, since all the metadata for
scientific projects, scholarships and publications is available on the World Wide Web freely
and without limitations; and of the Funding Agency’s staff, who are able to analyse subject
patterns in scientific research in a region and to assess the results of each
ongoing/completed research project. The FAPESP Virtual Library has been under development
since 2004 and continues to be upgraded and updated with Open Source software. It was
created on top of the Django Web Framework, which is written in Python, and uses MySQL as a
Relational Database Management System, Apache Solr as a search server and other
Python/Django modules developed by the Open Source community, such as Haystack, which
connects Django to Solr easily. The FAPESP Virtual Library achieved more than 4 million page
views in 2012, and its software is now in the process of being shared with other Funding
Agencies in Brazil, some of which have already expressed interest.
Keywords: Virtual Libraries; Open Source Software; Research Funding Agencies; Research
Projects; Scientific Publications.
1 INTRODUCTION
One definition of Virtual Libraries (VLs) is that of web information systems that provide
access to a centralized database of resources (e.g. metadata) that are usually scattered
across a network of systems (Marchiori, 1997). These resources are indexed, organized and
made available readily and economically.
Digital Libraries, on the other hand, assemble rich digital collections (e.g. full-text
documents, manuscripts, high-definition photos) (Zhang, 2010). They provide organization
and information retrieval capabilities similar to those of Virtual Libraries.
The Digital Library context has a well-honed community to support its Open Source
information systems, such as DSpace (Smith, Barton, Bass, Branschofsky, McClellan, Stuve,
Tansley, Walker, 2003), Greenstone (Witten, Boddie, Bainbridge, McNab, 2000) and Fedora
(Staples, Wayland, Payette, 2003). Virtual Libraries, by contrast, lack Open Source
solutions that adopt state-of-the-art technology in information retrieval, storage,
administration and so on.
As a solution for Virtual Libraries in Funding Agencies, the São Paulo Research Foundation
(FAPESP) has developed a public Virtual Library¹ to store and index the metadata for its
funded Research Projects, Scholarships and Scientific Publications. This Virtual Library
helps address the needs of three groups: Civil Society, which can access referential
information on the results achieved in funded research (taxes); Academia, since all the
metadata for scientific projects, scholarships and publications is freely available on the
World Wide Web without limitations; and the Funding Agency’s staff, who can analyse subject
patterns in scientific research in a region and assess the results of each ongoing or
completed research project. The VL system is now in the process of being shared with other
Funding Agencies in Brazil, some of which have already expressed interest.
The next sections describe: i) how VLs can help Research Funding Agencies accomplish their
goal of disseminating scientific information, specifically in the context of the São Paulo
Research Foundation (FAPESP); ii) the proposed Virtual Library System; iii) future work;
and iv) the conclusion, with results and recommendations.
2 VIRTUAL LIBRARIES (VLS) IN RESEARCH FUNDING AGENCIES (RFAS)
As stated before, VLs provide access to a centralized database of resources. In RFAs, VLs
store metadata for each funded Research Project and/or Scholarship. Some Funding Agencies
(e.g. the US National Science Foundation and the Swiss National Science Foundation) also
gather, store and publicly display the metadata of the publications resulting from each
funded grant.
These publications can be gathered manually or automatically. In the manual approach, each
researcher provides, for example, a list of scientific publications; the automatic approach
requires a crawler that searches for patterns in the relevant scientific publication
repositories or databases. The Swiss National Science Foundation uses the manual approach
(Swiss National Science Foundation (SNSF), n.d.), while the São Paulo Research Foundation
(FAPESP) uses the automatic approach, which is detailed in section 3.3.
¹ FAPESP Virtual Library website: http://www.bv.fapesp.br/en/
The features usually available in RFA VLs are: free and advanced search, search
refinement, result reordering, adjustable results per page, download of search results in
text formats and a link to each detailed search result. In this sense, RFA VLs are a
powerful tool for disseminating information about funded research on the World Wide Web,
since each user can retrieve exactly the information they need. Because one of the main
goals of every Research Funding Agency is to inform civil society about the results
achieved, these VL features help accomplish the information dissemination goal.
2.1 VL IN SÃO PAULO RESEARCH FOUNDATION (FAPESP)
The FAPESP Virtual Library has been under development since 2004 and continues to be
upgraded and updated with open source software. It received more than 4 million page views
in 2012, a figure that has risen year after year.
The São Paulo Research Foundation (FAPESP) is one of the leading agencies that fund
scientific research in Brazil, supporting research in all fields of knowledge, scientific
exchange and the dissemination of science and technology. Its mission is to foster scientific
research by awarding scholarships, fellowships and grants to investigators linked to higher
education and research institutions in the State of São Paulo. It was initiated in 1962 and
under the state constitution, 1% of all state taxes are appropriated to fund the Foundation.
Besides providing public access to information on funded research projects and
scholarships, the FAPESP VL also assists in the following main tasks:
• Assessment of Research Programs’ grants on a geographical and historical basis.
• Evaluation of grant candidates.
• Internationalization assessment.
• Pattern identification in scientific publications resulting from funded grants.
We have found that, once the research project data is publicly accessible through the VL,
users on the World Wide Web reuse it in interesting ways, such as: i) quoting parts of
research project abstracts to answer questions in forums and Q&A tools; ii) citing it as a
bibliographic reference in Wikipedia articles; iii) using it as a Curriculum Vitae for
funded researchers.
3 PROPOSED VIRTUAL LIBRARY SYSTEM
The proposed Virtual Library System has been developed to address the needs of a Research
Funding Agency, specifically those of the São Paulo Research Foundation (FAPESP).
Adopting Open Source software to build this system has reduced the effort required of the
development team, since they could debug and, in some cases, tweak the code of each
adopted Open Source package (detailed in 3.1) to their needs. This is not so easily done
with Proprietary Software, which does not give access to its internal source code.
3.1 MAIN OPEN SOURCE SOFTWARE SOLUTIONS ADOPTED
The proposed VL system was built on top of the Django Web Framework², which is written in
Python³. Both Python and Django share the philosophy of saving developers’ time in their
daily work. For instance, Django comes with an automatic admin interface (Django Software
Foundation, 2013), and Python programs tend to need fewer lines of code than equivalent
programs in other programming languages (Norvig, n.d.). This makes the proposed VL system
easier for the development team to maintain, which matters because systems tend to grow
in features over time and the team must still be able to work fast.
MySQL⁴ was selected as the Relational Database Management System; it is a popular Open
Source software with an active community (Oracle Corporation, 2013). Django has built-in
integration with MySQL, which makes deployment easier.
Apache Solr⁵ was selected as the search platform, in order to speed up search results and
to provide the user with state-of-the-art information retrieval features, such as spelling
correction, stemming, faceting and filtering. Apache Solr and Django are integrated
through Haystack⁶ at the Django layer and PySolr⁷ at the Python layer.
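As an illustration of the retrieval features mentioned above, the following minimal sketch builds a Solr select URL that combines a free-text query with filtering, faceting and spelling correction. The parameters `q`, `fq`, `facet`, `facet.field`, `spellcheck`, `rows` and `wt` are standard Solr request parameters; the host, the field names (`knowledge_area`, `institution`, `start_year`) and the helper function are assumptions made for this example and are not taken from the FAPESP VL. The sketch uses current Python 3 syntax rather than the Python 2.6 adopted in the system, and spelling correction only takes effect if the spellcheck component is configured on the Solr side.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, filters=None, facet_fields=None, rows=10):
    """Return a Solr select URL with filtering, faceting and spellcheck enabled."""
    params = [("q", text), ("rows", rows), ("wt", "json")]
    # Each filter query (fq) narrows the result set without affecting ranking.
    for field, value in (filters or {}).items():
        params.append(("fq", '%s:"%s"' % (field, value)))
    # Faceting asks Solr to count the matching documents per field value.
    if facet_fields:
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    # Ask for spelling suggestions (requires a configured spellcheck component).
    params.append(("spellcheck", "true"))
    return "%s/select?%s" % (base_url.rstrip("/"), urlencode(params))


# Hypothetical host and field names, for illustration only.
url = build_solr_query(
    "http://localhost:8983/solr",
    "bioenergy",
    filters={"knowledge_area": "Biology"},
    facet_fields=["institution", "start_year"],
)
print(url)
```

In the architecture described here, Haystack and PySolr would generate requests of this kind on behalf of the Django application, so the application code does not build such URLs by hand.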
To speed up web page delivery to the user and optimize server performance, Memcached⁸ is
adopted as the in-memory object caching software. It also integrates easily with Django
through its Cache Backend.
The whole system was deployed on a Linux server using Apache HTTP⁹ Server through the Web
Server Gateway Interface (WSGI).
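The components above are wired together mostly through Django settings. The fragment below is a minimal sketch of what such a configuration could look like, not the actual FAPESP configuration. The setting names follow the Django and Haystack releases of that period (later releases changed some of them); the database name, user, password, hosts, ports and the `vl_project.search_sites` module are placeholders invented for this example.

```python
# settings.py fragment (sketch). All names, hosts and credentials below
# are placeholders, not values used by the FAPESP Virtual Library.

# MySQL as the relational database, accessed through Django's MySQL backend.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "virtual_library",
        "USER": "vl_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "3306",
    }
}

# Memcached as the cache backend.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",
    }
}

# Haystack 1.x style settings pointing Django at a Solr instance.
HAYSTACK_SITECONF = "vl_project.search_sites"
HAYSTACK_SEARCH_ENGINE = "solr"
HAYSTACK_SOLR_URL = "http://127.0.0.1:8983/solr"
```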
3.2 SYSTEM ARCHITECTURE
Figure 1 shows the architecture of the proposed Virtual Library. Each software component
in the diagram is described in section 3.1.
² Django website: https://www.djangoproject.com/
³ Python website: http://www.python.org/
⁴ MySQL website: http://www.mysql.com/
⁵ Apache Solr website: http://lucene.apache.org/solr/
⁶ Haystack website: http://haystacksearch.org/
⁷ PySolr website: https://pypi.python.org/pypi/pysolr/
⁸ Memcached website: http://memcached.org/
⁹ Hypertext Transfer Protocol (HTTP)
Figure 1 – Proposed Virtual Library System Architecture
The adopted versions are Apache HTTP Server 2.2, Python 2.6, Django 1.3, PySolr 2.0.15,
MySQLdb 1.2.3, Haystack 1.2.7, Tomcat 6, Apache Solr 3.5, MySQL 5.1 and Memcached 1.43.
Some of the software solutions above can be replaced by other Open Source solutions, which
is ideal for organizations that already have a well-defined software infrastructure. The
components that could be replaced are detailed below.
• Apache HTTP Server + mod_wsgi could be replaced by other HTTP Servers and WSGI Servers,
such as Nginx (HTTP Server)¹⁰ + Gunicorn (WSGI Server)¹¹.
• MySQL could be replaced by PostgreSQL¹² or SQLite¹³, since both are supported by Django.
For each one, there is a Python database binding that would replace MySQLdb: for
PostgreSQL it is psycopg2 (used through Django’s postgresql_psycopg2 backend), and SQLite
already has a built-in binding in Python 2.6.
• Memcached, which stores its data in RAM, could be replaced by database or filesystem
caching, both of which are available in Django’s Cache Backend.
• Apache Solr + Tomcat + PySolr could be replaced by the other search engines supported by
Haystack, which are ElasticSearch¹⁴, Whoosh¹⁵ and Xapian¹⁶.
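The database and cache replacements listed above mostly amount to changing one backend path in the Django settings. The sketch below illustrates this under stated assumptions: the backend paths are the ones shipped with Django in the 1.3 era, while the helper function `backend_settings` and its short option names are hypothetical and only serve the example. Connection details (names, hosts, credentials) are left out on purpose.

```python
# Backend paths as shipped with Django around version 1.3.
DATABASE_ENGINES = {
    "mysql": "django.db.backends.mysql",
    "postgresql": "django.db.backends.postgresql_psycopg2",
    "sqlite": "django.db.backends.sqlite3",
}

CACHE_BACKENDS = {
    "memcached": "django.core.cache.backends.memcached.MemcachedCache",
    "database": "django.core.cache.backends.db.DatabaseCache",
    "filesystem": "django.core.cache.backends.filebased.FileBasedCache",
}


def backend_settings(database="mysql", cache="memcached"):
    """Return the DATABASES/CACHES skeleton for the chosen components.

    Hypothetical helper: it only selects the backend paths and omits
    connection details such as NAME, USER or LOCATION.
    """
    if database not in DATABASE_ENGINES:
        raise ValueError("unsupported database: %s" % database)
    if cache not in CACHE_BACKENDS:
        raise ValueError("unsupported cache: %s" % cache)
    return {
        "DATABASES": {"default": {"ENGINE": DATABASE_ENGINES[database]}},
        "CACHES": {"default": {"BACKEND": CACHE_BACKENDS[cache]}},
    }


# Example: an organization that already runs PostgreSQL and has no Memcached.
settings = backend_settings(database="postgresql", cache="filesystem")
```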
3.3 FEATURES
The last paragraph of section 2 discussed the features commonly available in RFA VLs. The
specific features of the proposed VL are highlighted below. The Open Source solutions
described in sections 3.1 and 3.2 were essential to their development.
¹⁰ Nginx website: http://nginx.org/
¹¹ Gunicorn website: http://gunicorn.org
¹² PostgreSQL website: http://www.postgresql.org/
¹³ SQLite website: http://www.sqlite.org/
¹⁴ ElasticSearch website: http://www.elasticsearch.org/
¹⁵ Whoosh website: https://bitbucket.org/mchaput/whoosh/wiki/Home
¹⁶ Xapian website: http://xapian.org/
i. Scientific publications gathering
We have developed an automated system that collects the scientific publications funded
by FAPESP that are available in Web of Science (WoS). The process is divided into two
steps, described below.
In the first step, the system queries WoS, using the filter named “Funding Agency”, for
entries in which the authors acknowledge FAPESP. For each range of results, it exports
a file in BibTeX format.
In the second step, the system parses all the BibTeX files and imports into the VL
database the metadata that is not yet available there. An important metadata field in
this process is the “Grant number”, which creates the relationship between the Grant
and the Scientific Publication, i.e. it makes it possible to identify the Scientific
Publication as the result of a specific Grant.
This is one of the key processes in an RFA VL, since it publicly shows civil society
and academia the results achieved by each funded Grant.
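The second step can be sketched as follows. This is a simplified illustration and not the parser used in the FAPESP VL: the sample BibTeX text, the field name `funding`, the grant number pattern and the function names are all assumptions made for the example, and a production system would more likely rely on a full BibTeX parser.

```python
import re

# Hypothetical, simplified BibTeX export. The "funding" field name and the
# grant number format are assumptions made for this illustration.
SAMPLE_BIBTEX = """
@article{example2012,
  title = {An Example Article},
  year = {2012},
  funding = {FAPESP [2010/12345-6; 2011/54321-0]}
}
@article{other2011,
  title = {Another Example},
  year = {2011},
  funding = {FAPESP [2010/12345-6]}
}
"""

ENTRY_RE = re.compile(r"@\w+\{([^,]+),(.*?)\n\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w[\w-]*)\s*=\s*\{(.*?)\}\s*,?\s*$", re.MULTILINE)
GRANT_RE = re.compile(r"\d{4}/\d{5}-\d")


def parse_entries(text):
    """Parse simple one-field-per-line BibTeX entries into dictionaries."""
    entries = []
    for key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value for name, value in FIELD_RE.findall(body)}
        fields["key"] = key.strip()
        entries.append(fields)
    return entries


def link_publications_to_grants(entries, known_grants, already_imported):
    """Return (grant number, publication key) pairs for new publications."""
    links = []
    for entry in entries:
        if entry["key"] in already_imported:
            continue  # metadata already available in the VL
        for grant in GRANT_RE.findall(entry.get("funding", "")):
            if grant in known_grants:
                links.append((grant, entry["key"]))
    return links


entries = parse_entries(SAMPLE_BIBTEX)
links = link_publications_to_grants(entries, {"2010/12345-6"}, set())
```

Grant numbers that are acknowledged in a publication but unknown to the VL (here `2011/54321-0`) are simply skipped in this sketch; a real import would probably log them for manual review.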
ii. Funded researchers’ Curriculum gathering
Most Brazilian Research Institutions adopt a Web Curriculum Vitae solution provided by a
Federal Funding Agency, called “Plataforma Lattes”¹⁷, in which researchers are asked to
register.
To provide more information about each funded researcher, the VL displays a link to their
Lattes curriculum. To accomplish this, an automated system was developed that queries
Lattes for each researcher’s link. The collected links are then displayed on each
researcher’s individual web page in the VL.
iii. Individual pages for specific metadata fields
In the first version of the proposed VL, searches for keywords that represented
researchers’ names, grants’ knowledge areas or research subjects were shown to the user as
a standard search result page.
In 2011 and 2012, individual pages were developed for these specific metadata fields.
These pages summarize the content available in the VL in a more comprehensive way than a
list of search results.
This solution has proven to be a great way to improve information access. For example,
the researchers’ individual pages already account for more than 21% of overall access.
A researcher’s page contains a short résumé, a photo, a list of all funded grants in which
the researcher participated, a list of the most frequent collaborators in those grants,
and links to Thomson Reuters’s ResearcherID and Google My Citations.
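The list of most frequent collaborators can be derived directly from the grant metadata. The following minimal sketch shows one way to compute it; the data structure, the sample names and the function `frequent_collaborators` are hypothetical and do not reflect the VL’s actual data model.

```python
from collections import Counter


def frequent_collaborators(grants, researcher, top=3):
    """Count how often other researchers share a grant with `researcher`."""
    counts = Counter()
    for grant in grants:
        members = grant["researchers"]
        if researcher in members:
            counts.update(m for m in members if m != researcher)
    return counts.most_common(top)


# Made-up sample data for illustration.
grants = [
    {"number": "G-1", "researchers": ["Ana", "Bruno", "Carla"]},
    {"number": "G-2", "researchers": ["Ana", "Bruno"]},
    {"number": "G-3", "researchers": ["Bruno", "Carla"]},
]

collaborators = frequent_collaborators(grants, "Ana")
# collaborators == [("Bruno", 2), ("Carla", 1)]
```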
iv. Heat map of funded grants per city of the State of São Paulo
Each individual page for specific metadata fields, as described in topic iii, displays a
heat map of the State of São Paulo showing the concentration of grants per municipality.
¹⁷ Plataforma Lattes website: http://lattes.cnpq.br/
This feature uses the Google Maps API¹⁸ to render the map. An offline automated procedure
collects the latitude and longitude of each municipality of the State of São Paulo that
has a Research Institution that hosted a Grant.
The map could be centered on other regions of the globe; it is only necessary to provide
the latitude, longitude and weight (e.g. number of funded grants) for each highlighted
point.
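The data needed by the heat map can be sketched as a simple aggregation: count the grants hosted in each municipality and attach the coordinates collected offline. The function name, the dictionary keys and the sample grants are assumptions made for this example, and the coordinates are approximate values used only for illustration.

```python
from collections import Counter

# Approximate coordinates, for illustration only. In the described system
# they would come from the offline geocoding procedure.
MUNICIPALITY_COORDS = {
    "São Paulo": (-23.55, -46.63),
    "Campinas": (-22.91, -47.06),
}


def heatmap_points(grants, coords):
    """Return one weighted point per municipality that hosted a grant."""
    counts = Counter(grant["municipality"] for grant in grants)
    points = []
    for name, weight in sorted(counts.items()):
        if name not in coords:
            continue  # municipality not geocoded yet
        lat, lng = coords[name]
        points.append({"lat": lat, "lng": lng, "weight": weight})
    return points


# Made-up sample grants.
grants = [
    {"number": "G-1", "municipality": "São Paulo"},
    {"number": "G-2", "municipality": "Campinas"},
    {"number": "G-3", "municipality": "São Paulo"},
]

points = heatmap_points(grants, MUNICIPALITY_COORDS)
```

A list of this shape can be serialized to JSON and handed to the map rendering code in the browser, which is why moving the map to another region only requires a different set of points.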
v. Historical view of grants awarded over the years
This feature uses the Google Charts API to show a dynamic chart of the number of grants
awarded per year. It makes historical patterns easier to understand when assessing, for
instance, Special Research Programs or specific Research Subjects. One type of assessment
would be to evaluate growth or decline within a historical window.
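A chart of this kind needs one row per year with the number of grants awarded. The sketch below prepares such rows, including years with no grants so that gaps remain visible in the chart. The field name `start_year`, the column labels and the function name are assumptions made for this example.

```python
from collections import Counter


def grants_per_year(grants):
    """Return header plus one [year, count] row per year in the window."""
    counts = Counter(grant["start_year"] for grant in grants)
    rows = [["Year", "Grants awarded"]]
    if not counts:
        return rows
    # Include years without grants so that the time axis has no holes.
    for year in range(min(counts), max(counts) + 1):
        rows.append([str(year), counts.get(year, 0)])
    return rows


# Made-up sample grants.
grants = [
    {"number": "G-1", "start_year": 2009},
    {"number": "G-2", "start_year": 2011},
    {"number": "G-3", "start_year": 2011},
]

rows = grants_per_year(grants)
```

The header-plus-rows layout is the array format that charting libraries commonly accept as input.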
vi. Visibility boost in search engines
In 2009, all the pages of the proposed VL were first optimized for search engines. The
technique, called Search Engine Optimization (SEO) (Grappone & Couzin, 2010), focuses on
adapting web pages so that search engines’ crawlers index them better.
This optimization has boosted the visibility of the VL. Access increased by 896% from 2008
to 2009 and by 2,641% from 2008 to 2012. Each year has registered a significant increase
in absolute numbers.
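To make these percentages concrete, the short sketch below converts a percentage increase into a growth factor, assuming the usual definition of percentage increase, (after - before) / before × 100. Under that assumption, an increase of 896% means that access in 2009 was about 9.96 times the 2008 level, and 2,641% means about 27.41 times. The function names are hypothetical and no absolute access figures are implied.

```python
def growth_factor(percent_increase):
    """Convert a percentage increase into a multiplication factor."""
    return 1 + percent_increase / 100.0


def percent_increase(before, after):
    """Percentage increase from `before` to `after`."""
    return (after - before) / before * 100.0


factor_2009 = round(growth_factor(896), 2)    # 9.96 times the 2008 level
factor_2012 = round(growth_factor(2641), 2)   # 27.41 times the 2008 level
```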
vii. Internationalization¹⁹
As the main feature for disseminating information to a wider range of users on the Web,
the proposed VL system can deliver metadata in multiple languages. In the FAPESP VL, the
adopted languages are Brazilian Portuguese and English.
One of Django’s built-in features is an internationalization process that makes the
translation job easier. All strings explicitly marked for translation are copied into a
single file per language, to be translated by the translation team. Once the strings are
translated, the developer must compile the files so that Django can automatically replace
the marked strings in each web page.
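In Django, this workflow is driven by the `makemessages` and `compilemessages` management commands, which respectively collect the marked strings into one message file per language and compile the translated files. The sketch below does not use Django; it only imitates the lookup that happens at page rendering time, with a plain dictionary standing in for the compiled message files. The catalogue contents and the `translate` function are illustrative assumptions.

```python
# One catalogue per language, standing in for the compiled message files.
# The entries are illustrative examples, not the VL's actual translations.
CATALOGUES = {
    "pt-br": {
        "Research Grants": "Auxílios à Pesquisa",
        "Scholarships": "Bolsas",
    },
    "en": {},  # source strings are already in English
}


def translate(message, language):
    """Return the translation, falling back to the original string."""
    return CATALOGUES.get(language, {}).get(message, message)


label_pt = translate("Scholarships", "pt-br")   # "Bolsas"
label_en = translate("Scholarships", "en")      # "Scholarships"
```

The fallback to the original string mirrors how gettext-based systems behave when a translation is missing, which lets new pages be published before the translation team has caught up.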
viii. Grant pages’ access statistics available to the RFA’s staff
The RFA’s staff can assess the Web access statistics of each Grant available through the
VL. This feature integrates with the Google Analytics API to gather the access statistics
of the grant page being viewed and displays the data as dynamic charts through the Google
Charts API.
¹⁸ API stands for Application Programming Interface
¹⁹ Internationalization as a mechanism to multiple language support in a system
This feature makes it easy for an RFA’s staff to assess the access trends of each page.
It also eliminates the need to train each individual in web analytics tools.
ix. VL staff’s administration area to create, edit and remove content
The librarian staff can create, edit and remove content in the VL using an administrative
area, after authenticating through a login page.
This administrative area was created using Django’s built-in features. From the system’s
data models, Django generates an automatic admin interface, which the developers can
customize as needed on a per-project basis. This built-in feature saved a considerable
amount of the developers’ time, since they did not have to develop a large part of the
admin functionality.
One example of a feature developed for the VL is the Librarians Production Reports, which
make it possible to assess the day-by-day work of the librarian staff. They also save the
staff time, since staff members no longer have to keep track of their own work on a daily
basis.
x. Sending email alerts to subscribers
On any search result page, users can register their email address to be notified of new
grant entries in the VL. New entries are emailed only if they match the keywords of the
search that the user performed.
Once a user registers an email address in a specific field of the interface, the system
stores it in the database along with the keywords entered and the selected refinements.
Once a week, an automated system queries this database and checks, for each user, for
grant entries added to the VL since the last email alert was issued. The new grants are
selected per user, and the process finishes by sending the personalized emails.
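The weekly selection step can be sketched as follows. This is a simplified illustration under stated assumptions: subscriptions and grants are plain dictionaries, matching is a case-insensitive keyword search on the grant title, search refinements are ignored, and the function returns the messages to be sent instead of sending them. None of the names come from the FAPESP VL.

```python
from datetime import date


def new_matching_grants(subscription, grants):
    """Grants added after the last alert whose title contains every keyword."""
    keywords = [k.lower() for k in subscription["keywords"]]
    matches = []
    for grant in grants:
        if grant["added_on"] <= subscription["last_alert"]:
            continue  # already covered by a previous alert
        title = grant["title"].lower()
        if all(keyword in title for keyword in keywords):
            matches.append(grant)
    return matches


def weekly_alerts(subscriptions, grants, today):
    """Build one personalized message per subscriber with new matches."""
    outbox = []
    for subscription in subscriptions:
        matches = new_matching_grants(subscription, grants)
        if matches:
            titles = [grant["title"] for grant in matches]
            outbox.append((subscription["email"], titles))
            subscription["last_alert"] = today  # an alert was issued
    return outbox


# Made-up sample data.
subscriptions = [
    {"email": "user@example.org", "keywords": ["ethanol"],
     "last_alert": date(2013, 1, 1)},
]
grants = [
    {"title": "Ethanol production", "added_on": date(2012, 12, 20)},
    {"title": "Sugarcane ethanol yield", "added_on": date(2013, 1, 5)},
    {"title": "Protein folding", "added_on": date(2013, 1, 6)},
]

outbox = weekly_alerts(subscriptions, grants, date(2013, 1, 8))
```

In a deployment like the one described, a scheduled job would run this selection once a week and pass the resulting messages to the mail sending code.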
4 FUTURE WORK
An important step for the proposed RFA VL System is to be shared freely with other RFAs.
This process is about to start with some RFAs in Brazil, as FAPESP is working to settle
cooperation agreements with them.
Future work on the system will focus on implementing more graphical summaries of data, to
ease the decision-making process of an RFA’s staff and to ease information access for
civil society and academia.
5 CONCLUSION
Although the Digital Library context has a well-honed community to support its Open Source
information systems, the Virtual Library context lacks Open Source solutions that adopt
state-of-the-art technology. The VL proposed in this work is a solution for Research
Funding Agency (RFA) Virtual Libraries (VLs): it assembles Open Source solutions that have
well-honed developer communities in order to deliver a high-impact RFA VL to civil
society, academia and the agency’s staff.
6 REFERENCES
Django Software Foundation. (2013). The Django admin site. Retrieved March 13, 2013,
from Django: https://docs.djangoproject.com/en/1.5/ref/contrib/admin/
Grappone, J., & Couzin, G. (2010). Search Engine Optimization (SEO): An Hour a Day.
Indianapolis: Wiley Publishing.
Marchiori, P. Z. (1997). "Ciberteca" ou biblioteca virtual: uma perspectiva de gerenciamento
de recursos de informação. Ciência da Informação, 26(2).
Norvig, P. (n.d.). How to Write a Spelling Corrector. Retrieved March 13, 2013, from
Peter@Norvig.com: http://norvig.com/spell-correct.html
Oracle Corporation. (2013). Download MySQL Community Server. Retrieved March 13,
2013, from MySQL: http://dev.mysql.com/downloads/mysql/
Smith, M., Barton, M., Bass, M., Branschofsky, M., McClellan, G., Stuve, D., et al. (2003).
DSpace: An Open Source Dynamic Digital Repository. D-Lib Magazine, 9(1).
Staples, T., Wayland, R., & Payette, S. (2003). The Fedora Project: An Open-source Digital
Object Repository Management System. D-Lib Magazine, 9(4).
Swiss National Science Foundation (SNSF). (n.d.). Output of research. Retrieved March 15,
2013, from Swiss National Science Foundation (SNSF):
http://www.snf.ch/E/current/Dossiers/Pages/output-of-research.aspx
Witten, I. H., Boddie, S. J., Bainbridge, D., & McNab, R. J. (2000). Greenstone: a
comprehensive open-source digital library software system. Proceedings of the fifth
ACM conference on Digital libraries (pp. 113-121). New York: ACM.
Zhang, Y. (2010). Developing a Holistic Model for Digital Library Evaluation. Journal of the
American Society for Information Science and Technology, 61(1), 88-110.

Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 

Recently uploaded (20)

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 

Virtual Libraries in Research Funding Agencies: an Open Source approach to disseminate Information to Faculty, Research Teams and Civil Society

  • 1. Submitted on: February 6

Virtual Libraries in Research Funding Agencies: an Open Source approach to disseminate Information to Faculty, Research Teams and Civil Society

Diego Ferreira Ucha
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: diego@fapesp.br

Guilherme Giacchetto Moreira
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: giacchetto@fapesp.br

Rosaly Favero Krzyzanowski
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: rosalyfk@fapesp.br

Inês Maria de Morais Imperatriz
Virtual Library, São Paulo Research Foundation (FAPESP), São Paulo, Brazil.
E-mail address: immi@fapesp.br

Copyright © 2013 by Diego F. Ucha, Guilherme G. Moreira, Rosaly F. Krzyzanowski, Inês M. Imperatriz. This work is made available under the terms of the Creative Commons Attribution 3.0 Unported License: http://creativecommons.org/licenses/by/3.0/

Abstract:

Virtual Libraries lack Open Source solutions when compared to the Digital Libraries context. The latter have a well-honed community to support information systems such as DSpace, Greenstone and Fedora. As a solution for Virtual Libraries in Funding Agencies, São Paulo Research Foundation (FAPESP) has developed a Virtual Library to store and index the metadata for its funded Research Projects, Scholarships and Scientific Publications. This Virtual Library helps address the needs of Civil Society, which can access referential information on the results achieved by research funded with its taxes; of Academia, since all the metadata for scientific projects, scholarships and publications is available on the World Wide Web freely and without limitations; and of the Funding Agency's staff, who are able to analyse subject patterns in scientific research in a region and to assess the results of each ongoing or completed research project. FAPESP Virtual Library has been developed since 2004, and continues to be upgraded and updated, with Open Source software. It was created on top of the Django Web Framework, which is written in Python, and uses MySQL as a Relational Database Management System, Apache Solr as a search server and other Python/Django modules developed by the Open Source community, such as Haystack, which connects Django to Solr easily. FAPESP Virtual Library achieved more than 4 million page views in 2012, and its software is now in the process of being shared with other Funding Agencies in Brazil, some of which have already expressed interest.

Keywords: Virtual Libraries; Open Source Software; Research Funding Agencies; Research Projects; Scientific Publications.
  • 2. 1 INTRODUCTION

One definition of Virtual Libraries (VLs) is that they are web information systems that provide access to a centralized database of resources (e.g. metadata) that are usually scattered across a network of systems (Marchiori, 1997). These resources are indexed, organized and made available readily and economically. Digital Libraries, by contrast, assemble rich digital collections (e.g. full-text documents, manuscripts, high-definition photos) (Zhang, 2010), while providing organization and information retrieval capabilities similar to those of Virtual Libraries.

The Digital Library context has a well-honed community to support its Open Source information systems, such as DSpace (Smith, Barton, Bass, Branschofsky, McClellan, Stuve, Tansley, Walker, 2003), Greenstone (Witten, Boddie, Bainbridge, McNab, 2000) and Fedora (Staples, Wayland, Payette, 2003). Virtual Libraries, on the other hand, lack Open Source solutions that adopt state-of-the-art technology in information retrieval, storage, administration and so on.

As a solution for Virtual Libraries in Funding Agencies, São Paulo Research Foundation (FAPESP) has developed a public Virtual Library¹ to store and index the metadata for its funded Research Projects, Scholarships and Scientific Publications. This Virtual Library helps address the needs of Civil Society, which can access referential information on the results achieved by research funded with its taxes; of Academia, since all the metadata for scientific projects, scholarships and publications is available on the World Wide Web freely and without limitations; and of the Funding Agency's staff, who are able to analyse subject patterns in scientific research in a region and to assess the results of each ongoing or completed research project. The VL system is now in the process of being shared with other Funding Agencies in Brazil, some of which have already expressed interest.
The next sections describe: i) how VLs can help Research Funding Agencies accomplish the goal of disseminating scientific information, specifically in the context of the São Paulo Research Foundation (FAPESP); ii) the proposed Virtual Library System; iii) future work; and iv) the conclusion, with results and recommendations.

2 VIRTUAL LIBRARIES (VLS) IN RESEARCH FUNDING AGENCIES (RFAS)

As stated before, VLs provide access to a centralized database of resources. In RFAs, VLs store metadata for each funded Research Project and/or Scholarship. Some Funding Agencies (e.g. the US National Science Foundation and the Swiss National Science Foundation) also gather, store and publicly display the metadata of the publications resulting from each funded grant.

These publications can be gathered manually or automatically. In the manual approach, each researcher provides, for example, a list of scientific publications. The automatic approach requires a crawler that searches for patterns in the relevant repositories and databases of scientific publications. The Swiss National Science Foundation takes the manual approach (Swiss National Science Foundation (SNSF), n.d.), while São Paulo Research Foundation (FAPESP) takes the automatic approach, which is detailed in section 3.3, item i.

¹ FAPESP Virtual Library website: http://www.bv.fapesp.br/en/
  • 3. The features usually available in RFA VLs are: free and advanced search, search refinement, result reordering, a configurable number of results per page, download of search results in text formats, and a link to the detailed view of each search result. In this sense, RFA VLs are a powerful tool for disseminating information about funded research on the World Wide Web, since each user is able to retrieve the exact information they need. Since one of the main goals of every Research Funding Agency is to inform civil society about the results achieved, these VL features help to accomplish the information dissemination goal.

2.1 VL IN SÃO PAULO RESEARCH FOUNDATION (FAPESP)

FAPESP Virtual Library has been developed since 2004, and continues to be upgraded and updated with Open Source software. It achieved more than 4 million page views in 2012, a figure that has been rising year after year.

The São Paulo Research Foundation (FAPESP) is one of the leading agencies that fund scientific research in Brazil, supporting research in all fields of knowledge, scientific exchange and the dissemination of science and technology. Its mission is to foster scientific research by awarding scholarships, fellowships and grants to investigators linked to higher education and research institutions in the State of São Paulo. It was initiated in 1962 and, under the state constitution, 1% of all state taxes is appropriated to fund the Foundation.

Besides providing public access to information on the funded research projects and scholarships, FAPESP VL also assists in the following main tasks:

• Assessment of Research Programs' grants on a geographical and historical basis.
• Evaluation of grant candidates.
• Internationalization assessment.
• Identification of patterns in the scientific publications resulting from funded grants.
We have observed that, once the research project data is publicly available through the VL, users on the World Wide Web reuse it in interesting ways, such as: i) quoting parts of the research project abstracts to answer questions in forums and Q&A tools; ii) citing it as a bibliographic reference in Wikipedia articles; iii) using it as a Curriculum Vitae for funded researchers.

3 PROPOSED VIRTUAL LIBRARY SYSTEM

The proposed Virtual Library System was developed to address the needs of a Research Funding Agency, specifically those of São Paulo Research Foundation (FAPESP). Building the system on Open Source software has eased the development team's work, since for each adopted Open Source package (detailed in 3.1) the team could debug and, in some cases, tweak the code to its needs. This is not so easily accomplished with Proprietary Software, which does not allow its internal source code to be analyzed.
  • 4. 3.1 MAIN OPEN SOURCE SOFTWARE SOLUTIONS ADOPTED

The proposed VL system was built on top of the Django Web Framework², which is written in Python³. Both Python and Django share the philosophy of saving developers' time in their daily work. For instance, Django comes with an automatic admin interface (Django Software Foundation, 2013), and Python programs end up with fewer lines of code than equivalent programs in other programming languages (Norvig, n.d.). As a result, the proposed VL system is easier for the development team to maintain. This matters because systems tend to grow in features over time, and the team still needs to be able to work fast.

MySQL⁴, a popular Open Source package with an active community (Oracle Corporation, 2013), was selected as the Relational Database Management System. Django has built-in integration with MySQL, which makes deployment easier.

Apache Solr⁵ was selected as the search platform, in order to speed up search results and to provide the user with state-of-the-art information retrieval features, such as spelling correction, stemming, faceting and filtering. Apache Solr is integrated with Django through Haystack⁶ at the Django layer and PySolr⁷ at the Python layer.

To speed up web page delivery to the user and optimize server performance, Memcached⁸ is adopted as an in-memory object cache. It also integrates easily with Django through Django's Cache Backend.

The whole system was deployed on a Linux server using Apache HTTP⁹ Server through the Web Server Gateway Interface (WSGI).

3.2 SYSTEM ARCHITECTURE

Figure 1 shows the system architecture of the proposed Virtual Library. Each software component in the diagram is detailed in section 3.1.
² Django website: https://www.djangoproject.com/
³ Python website: http://www.python.org/
⁴ MySQL website: http://www.mysql.com/
⁵ Apache Solr website: http://lucene.apache.org/solr/
⁶ Haystack website: http://haystacksearch.org/
⁷ PySolr website: https://pypi.python.org/pypi/pysolr/
⁸ Memcached website: http://memcached.org/
⁹ Hypertext Transfer Protocol (HTTP)
  • 5. Figure 1 – Proposed Virtual Library System Architecture

The adopted versions of each software component are Apache HTTP Server 2.2, Python 2.6, Django 1.3, PySolr 2.0.15, MySQLdb 1.2.3, Haystack 1.2.7, Tomcat 6, Apache Solr 3.5, MySQL 5.1 and Memcached 1.43.

Some of the software solutions above can be replaced by other Open Source solutions. This is ideal for organizations that already have a well-defined software infrastructure. The components that could be replaced are detailed below.

• Apache HTTP Server + mod_wsgi could be replaced by other HTTP Servers and WSGI Servers, such as Nginx (HTTP Server)¹⁰ + Gunicorn (WSGI Server)¹¹.
• MySQL could be replaced by PostgreSQL¹² or SQLite¹³, since both are supported by Django. Each has a Python database binding that would replace MySQLdb: for PostgreSQL the binding is postgresql_psycopg2, and SQLite already has a built-in binding in Python 2.6.
• Memcached, which stores its data in RAM, could be replaced by database or filesystem caching, both of which are available in Django's Cache Backend.
• Apache Solr + Tomcat + PySolr could be replaced by the other search engines supported by Haystack, which are ElasticSearch¹⁴, Whoosh¹⁵ and Xapian¹⁶.

3.3 FEATURES

Section 2 discussed the features commonly available in RFA VLs. The specific features of the proposed VL are highlighted in items i to x of this section. The Open Source solutions described in sections 3.1 and 3.2 were essential in developing them.

¹⁰ Nginx website: http://nginx.org/
¹¹ Gunicorn website: http://gunicorn.org
¹² PostgreSQL website: http://www.postgresql.org/
¹³ SQLite website: http://www.sqlite.org/
¹⁴ ElasticSearch website: http://www.elasticsearch.org/
¹⁵ Whoosh website: https://bitbucket.org/mchaput/whoosh/wiki/Home
¹⁶ Xapian website: http://xapian.org/
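As a rough illustration of sections 3.1 and 3.2, the stack could be wired together in a Django settings module along the following lines. This is a minimal sketch and not FAPESP's actual configuration: the database name, credentials, hosts, ports and the `vl_project.search_sites` module path are placeholders, the setting names follow the Django 1.3 and Haystack 1.x conventions of the versions listed above, and the `with_database` helper is a hypothetical convenience added only to show how small the substitution described in section 3.2 is.

```python
import copy

# Relational storage: MySQL through the MySQLdb binding (section 3.1).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "virtual_library",   # placeholder database name
        "USER": "vl_user",           # placeholder credentials
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "3306",
    }
}

# Object caching: Memcached through Django's cache backend.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",   # placeholder host and port
    }
}

# Search: Haystack 1.x style settings pointing at a Solr instance.
HAYSTACK_SITECONF = "vl_project.search_sites"   # hypothetical module path
HAYSTACK_SEARCH_ENGINE = "solr"
HAYSTACK_SOLR_URL = "http://127.0.0.1:8983/solr"

# Substitutions named in section 3.2, expressed as alternative settings values.
ALTERNATIVE_DB_ENGINES = {
    "postgresql": "django.db.backends.postgresql_psycopg2",
    "sqlite": "django.db.backends.sqlite3",
}
ALTERNATIVE_CACHE_BACKENDS = {
    "database": "django.core.cache.backends.db.DatabaseCache",
    "filesystem": "django.core.cache.backends.filebased.FileBasedCache",
}


def with_database(engine_key, name):
    """Return a copy of DATABASES that uses one of the alternative engines."""
    swapped = copy.deepcopy(DATABASES)
    swapped["default"]["ENGINE"] = ALTERNATIVE_DB_ENGINES[engine_key]
    swapped["default"]["NAME"] = name
    return swapped
```

Calling `with_database("sqlite", "/tmp/vl.db")` returns a settings dictionary for SQLite and leaves the MySQL configuration untouched, which reflects the point of section 3.2: replacing a component is largely a configuration change rather than a rewrite.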
  • 6. i. Scientific publications gathering

We have developed an automated system that collects scientific publications that were funded by FAPESP and are available in Web of Science (WoS). The process has two steps.

In the first step, the system queries WoS, using the filter named "Funding Agency", for publications in which the authors acknowledge FAPESP. For each range of results, it exports a file in BibTeX format.

In the second step, the system parses all the BibTeX files and imports into the VL database the metadata of every publication that is not yet available in the VL.

An important metadata field in this process is the "Grant number". This field creates the relationship between the Grant and the Scientific Publication, i.e. it makes it possible to identify a Scientific Publication as the result of a specific Grant.

This is one of the key processes in an RFA VL, since it allows the VL to show civil society and academia publicly the results achieved by each funded Grant.

ii. Funded researchers' Curriculum gathering

The majority of Brazilian Research Institutions adopt a Web Curriculum Vitae solution provided by a Federal Funding Agency, called "Plataforma Lattes"¹⁷, in which researchers are asked to register. In order to provide more information about each funded researcher, the VL displays a link to their Lattes curriculum. To accomplish this, an automated system was developed that queries Lattes for each researcher's link. The collected links are then displayed on each researcher's individual web page in the VL.

iii. Individual pages for specific metadata fields

In the first version of the proposed VL, searches for keywords that represented researchers' names, grants' knowledge areas or research subjects were answered with a standard search result page. In 2011 and 2012, individual pages were developed for these specific metadata fields.
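The second step of the publication gathering described in item i (parse the exported BibTeX, skip publications already stored, and link each new one to its Grant by grant number) can be sketched as follows. This is an illustration under stated assumptions and not the FAPESP implementation: the `funding-acknowledgement` field name, the grant number pattern, the sample entry and the in-memory `known_keys` and `grants` structures (standing in for the VL database) are all hypothetical, and the regular expressions only handle simple one-line fields.

```python
import re

# Hypothetical sample of one exported entry; real exports may differ.
SAMPLE = """@article{ISI:000300000000001,
Author = {Silva, A. and Souza, B.},
Title = {{An example title}},
Year = {{2012}},
Funding-Acknowledgement = {{FAPESP [2010/12345-6, 2011/54321-0]}},
Unique-ID = {{ISI:000300000000001}},
}
"""

FUNDING_FIELD = "funding-acknowledgement"   # assumed field name
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.S)
FIELD_RE = re.compile(r"([\w-]+)\s*=\s*\{+(.*?)\}+\s*,?\s*$", re.M)
GRANT_RE = re.compile(r"\b(\d{2,4}/\d{5}-\d)\b")   # assumed grant number shape


def parse_bibtex(text):
    """Yield one dict per BibTeX entry, with lower-cased field names."""
    for entry_type, key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value.strip()
                  for name, value in FIELD_RE.findall(body)}
        fields["entry_type"] = entry_type.lower()
        fields["key"] = key
        yield fields


def import_publications(bibtex_text, known_keys, grants):
    """Import entries not yet stored and link them to grants.

    known_keys: set of publication keys already in the VL.
    grants: dict mapping grant number -> list of linked publication keys.
    Returns the keys of the newly imported publications.
    """
    imported = []
    for pub in parse_bibtex(bibtex_text):
        if pub["key"] in known_keys:
            continue   # already available in the VL
        known_keys.add(pub["key"])
        for number in GRANT_RE.findall(pub.get(FUNDING_FIELD, "")):
            if number in grants:   # link only to grants the VL knows about
                grants[number].append(pub["key"])
        imported.append(pub["key"])
    return imported
```

Running `import_publications(SAMPLE, set(), {"2010/12345-6": []})` imports the sample entry and attaches it to the one known grant; a second run over the same file with the same `known_keys` set imports nothing, which mirrors the "if not yet available in the VL" condition.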
The individual pages for these metadata fields summarize the content available in the VL more comprehensively than a list of search results does. This solution has proven to be a great way to improve information access. As an example, the researchers' individual pages already account for more than 21% of overall access.

A researcher's page, for instance, contains the researcher's short résumé and photo, a list of all funded grants in which the researcher participated, a list of the most frequent collaborators in those grants, and links to Thomson Reuters's ResearcherID and Google My Citations.

iv. Heat map of funded grants per city of the State of São Paulo

Each individual page for the specific metadata fields described in item iii displays a heat map of the State of São Paulo that shows the concentration of grants per municipality.

¹⁷ Plataforma Lattes website: http://lattes.cnpq.br/
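The data behind such a heat map can be prepared with a simple aggregation: count the grants hosted in each municipality and attach that count, as a weight, to the municipality's coordinates. The sketch below is illustrative only. The coordinate table holds approximate values for three municipalities, and the function name and the `lat`/`lng`/`weight` keys are assumptions rather than the VL's actual data model.

```python
from collections import Counter

# Approximate coordinates, for illustration; the VL collects these offline.
MUNICIPALITY_COORDS = {
    "São Paulo": (-23.55, -46.63),
    "Campinas": (-22.91, -47.06),
    "São Carlos": (-22.02, -47.89),
}


def heatmap_points(grant_cities, coords=MUNICIPALITY_COORDS):
    """Turn a list of host municipalities (one per grant) into weighted points."""
    counts = Counter(grant_cities)
    points = []
    for city, weight in sorted(counts.items()):
        if city not in coords:
            continue   # no coordinates collected for this municipality yet
        lat, lng = coords[city]
        points.append({"lat": lat, "lng": lng, "weight": weight})
    return points
```

For example, `heatmap_points(["Campinas", "São Paulo", "São Paulo"])` yields one point for Campinas with weight 1 and one for São Paulo with weight 2, a form that can be serialized and handed to whatever map layer renders the page.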
  • 7. The heat map uses the Google Maps API¹⁸ to render the map. An offline automated procedure collects the latitude and longitude of each municipality in the State of São Paulo that has a Research Institution that hosted a Grant. The map could be centered on other regions of the globe; it is only necessary to provide the latitude, longitude and weight (e.g. quantity of funded grants) of each highlighted point.

v. Historical view of grants awarded throughout the years

This feature uses the Google Charts API to show a dynamic chart of the number of grants awarded per year. It makes it easier to understand historical patterns when assessing, for instance, Special Research Programs or specific Research Subjects. One type of assessment would be to evaluate growth or decline within a historical window.

vi. Visibility boost in search engines

In 2009, all pages of the proposed VL were first optimized for search engines. The technique, called Search Engine Optimization (SEO) (Grappone & Couzin, 2010), focuses on adapting web pages so that they are better indexed by search engines' crawlers.

This optimization has boosted the visibility of the VL. Access increased by 896% from 2008 to 2009, and by 2,641% from 2008 to 2012. Every year has registered a significant increase in absolute numbers.

vii. Internationalization¹⁹

As the main feature for disseminating information to a wider range of users on the Web, the proposed VL system is capable of delivering metadata in multiple languages. In FAPESP VL, the adopted languages are Brazilian Portuguese and English.

One of Django's built-in features is an internationalization process that makes the translation job easier. All strings that are explicitly marked for translation are copied into a single file per language, to be translated by the translation team.
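The marking mechanism can be illustrated without a Django project. Django's translation machinery builds on GNU gettext, and in a Django code base strings are marked with helpers such as `ugettext` or the `{% trans %}` template tag, extracted with the `makemessages` command and compiled with `compilemessages`. The sketch below uses only Python's standard `gettext` module, with an in-memory dictionary standing in for a compiled message file; the `DictTranslations` class, the `CATALOGS` table, the `render_menu` function and the sample strings are hypothetical.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """In-memory catalog standing in for a compiled message file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Fall back to the original string when no translation exists.
        return self._messages.get(message, message)


# One catalog per language; the source strings are written in English.
CATALOGS = {
    "en": DictTranslations({}),
    "pt-br": DictTranslations({
        "Research Grants": "Auxílios à Pesquisa",
        "Scholarships": "Bolsas",
    }),
}


def render_menu(language):
    """Return the menu labels in the requested language."""
    _ = CATALOGS[language].gettext   # conventional marker for translatable strings
    return [_("Research Grants"), _("Scholarships"), _("Scientific Publications")]
```

`render_menu("pt-br")` returns the two translated labels and leaves "Scientific Publications" in English, because that string has no entry in the catalog yet. This fallback behaviour is what lets a partially translated site keep working while the translation team catches up.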
Once the strings are translated, the developer must compile the translation files so that Django can automatically replace the marked strings in each web page.

viii. Grant pages' access statistics available to the RFA's staff

The RFA's staff is able to assess the Web access statistics of each Grant available through the VL. This feature integrates with the Google Analytics API to gather the Web access statistics of the grant page being viewed and, through the Google Charts API, displays the data in dynamic charts.

¹⁸ API stands for Application Programming Interface
¹⁹ Internationalization as a mechanism for multiple language support in a system
  • 8. The access statistics feature gives the RFA's staff an easy way to assess trends in information access for each page. It also eliminates the need to train each staff member in web analytics tools.

ix. VL staff's administration area to create, edit and remove content

The librarian staff is able to create, edit and remove content in the VL through an administrative area, after authenticating with their credentials on a login page.

This administrative area was created using Django's built-in features. Once the system has been modeled, Django is able to generate an admin interface automatically. It also enables the developers to customize this admin interface as needed, on a per-project basis.

This built-in feature saved a considerable amount of the developers' time, since they did not have to develop a large part of the admin functionality. One example of a feature developed for the VL is the Librarians Production Reports, which make it possible to assess the day-to-day work of the librarian staff. It also saves the staff time, since they no longer have to keep track of their own work on a daily basis.

x. Sending email alerts to subscribers

On any search result page, users are able to register their email address to be notified of new grant entries in the VL. New entries are only emailed if they match the keywords the user provided for that search.

Once users register their email address in a specific field on the interface, the system stores it in the database along with the keywords they entered and the refinements they selected.

Once a week, an automated system queries this database and checks for grant entries added to the VL since the last email alert issued to each user. The new grants are selected per user, and the process finishes by sending the personalized emails.

4 FUTURE WORK

An important step for the proposed RFA VL System is to be shared freely among other RFAs.
The sharing process is about to start with some RFAs in Brazil, since FAPESP is working to settle cooperation agreements with them.

Future work on this system will focus on implementing more graphical summarizations of data, to ease the decision-making process of an RFA's staff and to ease information access for civil society and academia.

5 CONCLUSION

Although the Digital Library context has a well-honed community to support its Open Source information systems, the Virtual Library context lacks Open Source solutions that adopt state-of-the-art technology.

The VL proposed in this work is a possible solution for Research Funding Agency (RFA) Virtual Libraries (VLs). It assembles Open Source solutions that have well-honed developer communities in order to deliver a high-impact RFA VL to civil society, academia and the agency's staff.
  • 9. 6 REFERENCES

Django Software Foundation. (2013). The Django admin site. Retrieved March 13, 2013, from Django: https://docs.djangoproject.com/en/1.5/ref/contrib/admin/

Grappone, J., & Couzin, G. (2010). Search Engine Optimization (SEO): An Hour a Day. Indianapolis: Wiley Publishing.

Marchiori, P. Z. (1997). "Ciberteca" ou biblioteca virtual: uma perspectiva de gerenciamento de recursos de informação. Ciência da Informação, 26(2).

Norvig, P. (n.d.). How to Write a Spelling Corrector. Retrieved March 13, 2013, from Peter@Norvig.com: http://norvig.com/spell-correct.html

Oracle Corporation. (2013). Download MySQL Community Server. Retrieved March 13, 2013, from MySQL: http://dev.mysql.com/downloads/mysql/

Smith, M., Barton, M., Bass, M., Branschofsky, M., McClellan, G., Stuve, D., et al. (2003). DSpace: An Open Source Dynamic Digital Repository. D-Lib Magazine, 9(1).

Staples, T., Wayland, R., & Payette, S. (2003). The Fedora Project: An Open-source Digital Object Repository Management System. D-Lib Magazine, 9(4).

Swiss National Science Foundation (SNSF). (n.d.). Output of research. Retrieved March 15, 2013, from Swiss National Science Foundation (SNSF): http://www.snf.ch/E/current/Dossiers/Pages/output-of-research.aspx

Witten, I. H., Boddie, S. J., Bainbridge, D., & McNab, R. J. (2000). Greenstone: a comprehensive open-source digital library software system. Proceedings of the fifth ACM conference on Digital libraries (pp. 113-121). New York: ACM.

Zhang, Y. (2010). Developing a Holistic Model for Digital Library Evaluation. Journal of the American Society for Information Science and Technology, 61(1), 88-110.