Ecological Informatics 4 (2009) 23–33


available at www.sciencedirect.com
www.elsevier.com/locate/ecolinf

Biodiversity and climate change use scenarios framework for the GEOSS interoperability pilot process

Stefano Nativi a,⁎, Paolo Mazzetti a, Hannu Saarenmaa b, Jeremy Kerr c, Éamonn Ó Tuama d

a Italian National Research Council (CNR)-IMAA, C.da S. Loja, Zona industriale-Tito Scalo, 85050 Italy
b Finnish Museum of Natural History, 00014 University of Helsinki, Finland
c Canadian Facility for Ecoinformatics Research, Department of Biology, University of Ottawa, Box 450, Station A, Ottawa, ON, Canada K1N6N5
d Global Biodiversity Information Facility Secretariat, Universitetsparken 15, 2100 Copenhagen, Denmark

ARTICLE DATA

Article history:
Received 22 August 2008
Received in revised form 25 November 2008
Accepted 26 November 2008

Keywords:
Biodiversity
Climate Change
SOA (Service Oriented Architecture)
Mediation services
GEOSS (Global Earth Observation System of Systems)
IP3 (Interoperability Process Pilot Project)
Megascience infrastructure
Macroecological research

ABSTRACT

Climate change threatens to commit 15–37% of species to extinction by 2050. There is a clear need to support policy-makers analyzing and assessing the impact of climate change along with land use changes. This requires a megascience infrastructure that is capable of discovering and integrating enormous volumes of multi-disciplinary data, i.e. data from biodiversity, earth observation, and climatic archives. Metadata and services interoperability is necessary. The Global Earth Observation System of Systems (GEOSS) works to realize such an interoperability infrastructure based on systems architecture standardization. In this paper we describe the results of linking the infrastructures of Climate Change research and Biodiversity research together using the approach envisioned by GEOSS. In fact, we present and discuss a service-oriented framework which was applied to implement and demonstrate the Climate Change and Biodiversity use scenario of the GEOSS Interoperability Process Pilot Project (IP3). This interoperability is done for the purpose of enabling scientists to do large-scale ecological analysis. We describe a generic use scenario and related modelling workbench that implement an environment for studying the impacts of climate change on biodiversity. The Service Oriented Architecture framework, which realizes this environment, is described. Its standard-based components and services, according to GEOSS requirements, are discussed. This framework was successfully demonstrated at the GEO IV Ministerial Meeting in Cape Town, South Africa, November 2007.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Climate change threatens to commit 15–37% of species to extinction by 2050 (Thomas et al., 2004; also Buckley and Roughgarden 2004; Harte et al., 2004), accelerating a mass extinction precipitated by widespread land use changes. The need to assess these impacts and recommend solutions to policy-makers is correspondingly acute and has been highlighted by the Fourth Assessment Report of the Intergovernmental Panel for Climate Change (IPCC, 2007).

Such analyses require robust infrastructure capable of integrating enormous volumes of data from biodiversity archives, satellite remote sensing, and climatic data. The integration is a stepwise process, where careful definition of a series of specific applications for these data (“use scenarios”), including step-by-step processes for analyses, are required.

⁎ Corresponding author.
E-mail addresses: nativi@imaa.cnr.it (S. Nativi), mazzetti@imaa.cnr.it (P. Mazzetti), hannu.saarenmaa@helsinki.fi (H. Saarenmaa), jkerr@uottawa.ca (J. Kerr), eotuama@gbif.org (É. Ó Tuama).
1574-9541/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.ecoinf.2008.11.002
Using real data to solve specific problems, these use scenarios can be transformed into use cases and implemented within an interoperable, distributed component architecture. Different parts of that infrastructure have already been established independently by the IPCC and by the Global Biodiversity Information Facility (GBIF) in their own areas, respectively.

However, linking major infrastructures in separate domains together will require additional sources of metadata and related infrastructure services. That is, unless the integration is done using customized point-to-point protocols where provider and user know each other but third parties are excluded. The Global Earth Observation System of Systems (GEOSS) now promises to make these disparate services available through its Clearinghouse registry of registries system.

The Group on Earth Observations (GEO) currently includes 76 member countries, the European Commission, and 51 intergovernmental, international, and regional organizations. GEO envisioned a “system of systems” to help realize a future wherein decisions and actions for the benefit of humankind are informed via coordinated, comprehensive and sustained Earth observations and information (GEO, 2005). The GEOSS Implementation Plan spans 10 years (2005–2015) and recognises nine Societal Benefit Areas (SBAs) including Climate, Ecosystems and Biodiversity. The GEOSS strategy consists of leveraging existing systems and services and promoting interoperability through the adoption of a Service Oriented Architecture (SOA) framework approach based on established standards from bodies such as the International Organization for Standardization (ISO) and Open Geospatial Consortium (OGC).

In this paper, we describe the results of linking the infrastructures of Climate Change research and Biodiversity research together using an approach that is compatible with the GEOSS service-oriented framework. This interoperability is done for the purpose of enabling scientists to do large-scale ecological analysis. We describe a generic use scenario and related modelling workbench that implement an environment for studying the impacts of climate change on biodiversity.

The most widely used approach for describing the steps for large-scale biodiversity data analysis is Ecological Niche Modelling (ENM), pioneered by Peterson et al. (2001, 2002) and refined subsequently by many others (e.g. Elith et al., 2006). ENM is now employed for a range of global change and macroecological applications (e.g. White and Kerr 2007; Kerr et al., 2007; Kharouba et al., in press). GBIF has promoted this approach and organised several international workshops on the topic.¹ The modelling tools for ENM have diversified and are being made available as an open framework and web services² through the OpenModeller project³ (see Canhos et al., 2004).

The steps that are required for ENM have recently been described in detail by Santana et al. (2008). They use the Service Oriented Architecture (SOA) as a framework, describe the flow of steps as a “business process” using SOA terminology, and build on the OpenModeller technology. In this paper, we extend the scope and present an SOA architectural solution developed for analyzing the impact of climate change on biodiversity, including the required use scenario. Because acquiring and processing environmental data is a crucial step in this analysis, we describe the SOA framework and the developed architecture components and services, which are standard-based according to GEOSS requirements. We present how these were used in the GEO IV Summit in Cape Town in November 2007 for on-the-fly data discovery and selection (Nativi et al., 2007b). Finally, we describe the system and the experiments which are part of the Interoperability Pilot Process (IP3). This is an action of the GEOSS AR-07-01 Task (GEO, 2007–2009). Several components and services are already registered in the GEOSS registers. We believe that presentation of this work can help inform the ecological and biodiversity community of the importance of GEOSS for efficient macroecological research.

1.1. The Species Response to Climate Change use scenarios for GEOSS IP3

The presented SOA framework was applied to implement and demonstrate the Climate Change and Biodiversity use scenario of the GEOSS IP3.

The Interoperability Process Pilot Project (IP3) is part of the GEOSS task AR-07-01 (GEO, 2007–2009), which aims to prototype and validate the implementation of the “Core” GEOSS infrastructure and the processes for contributing and linking systems. The IP3 was conceived as a way to exercise the process that has been defined for reaching interoperability arrangements (Khalsa et al., submitted for publication). IP3 helps to identify the system components and discuss the standards, interface protocols and interoperability agreements currently used by disciplinary systems, such as GBIF and IPCC.

IP3 developed a series of projects involving different SBAs, working out a suite of demonstrations. Four systems/disciplines were initially identified as sources for the pilot project, covering Earth's water cycle, climate, seismology, and biodiversity (Khalsa et al., submitted for publication). One of them, “Species Response to Climate Change”, was developed into functional demonstrations building on the presented framework: Ecological Niche Modelling was used to predict present-day niches for different species (e.g. butterflies in Canada and Alaska and pikas in North-West America) and then to predict their shifts under different global and regional climate change scenarios. These demonstrations were presented at the GEO IV Ministerial Meeting in Cape Town, South Africa, November 2007 (Nativi et al., 2007b).

2. Scenarios definition

In the following, we briefly delineate the steps, with accompanying data needs, in a simple scenario intended to provide example outcomes for a topical purpose, namely predicting shifts in the spatial distribution of species' niches as a consequence of climate change. Each step in the process we outline reflects the business processes for ENM (Santana et al., 2008; Fig. 1). They are:

1. Identify taxa for which sufficient data exist to conduct broad-scale analyses aimed at predicting the impacts of global change on species distributions in the future. It is useful for such data to have a historical dimension also, reaching back 30 years or more, so that responses to recently observed climatic and land use changes can be documented. Predictions for future niche shifts are likely to be more accurate when limited to species that have recently responded predictably to climate changes that have been directly observed (Kharouba et al., in press). Although there are many biodiversity datasets that satisfy these stringent criteria, they are patchily distributed (e.g. birds from the United Kingdom, butterflies from Canada, etc.). ENM can be applied to these datasets. Identifying other datasets is a challenge but one that GBIF can help solve. If all required datasets are stored in a repository online, then data mining techniques can be used to discover available, comprehensive datasets. If caching or other central or distributed repositories are inaccessible or do not exist, expert advice may still successfully identify needed datasets.
2. After assembling biodiversity datasets and mapping their spatial and temporal distributions, gaps in information become clearer. These gaps can then imply new data sharing opportunities within and among countries to fill in those gaps. The need for more and better data can be communicated to policy makers.
3. Determine what environmental characteristics most likely influence target species' niches. Examples of data that are most widely necessary include high resolution land cover and climate data. Climate and land use change models can help forecast future environmental conditions, but models of future conditions are unlikely to match present-day spatial observations of either climate or weather, or of land use. The latter can be observed remotely using very high resolution satellite data (see Kerr and Ostrovsky, 2003) or in situ. Clearly, models of the future are subject to relatively large uncertainties but can nevertheless provide plausible forecasts of change that can, and should, be considered for planning purposes.
4. Determine what climatological data are needed for Ecological Niche Modelling of the selected group of organisms for past, current, or future scenarios.
5. Determine which modelling algorithms will most accurately and precisely predict shifts in distribution and abundance for the selected group of organisms. Identify the reporting needs in terms of data accuracy and error propagation.
6. Collect the selected species occurrence data (e.g. from GBIF) and the environmental and climate data (e.g. from IPCC) into the modelling workbench.
7. Run the models and present outputs as series of maps and predicted abundance numbers. Model accuracy should be tested so uncertainty in model outputs under the range of desired scenarios can be included to provide a realistic depiction of policy options. This step will eliminate model outputs that are clearly inaccurate and consequently minimize the likelihood that failed models will inadvertently influence policy. This approach resembles that of the IPCC in presenting different climate change scenarios, depending on variations in emission reduction efforts.

Fig. 1 – Activity diagram for the use scenario of biodiversity and climate change.

The above scenario is but one example of a broad-scale application for biodiversity data. Biodiversity is also affected by other factors such as tropical deforestation, for which other scenarios can be produced.

Biodiversity is not only being impacted, but is also an essential component in providing ecosystem services for agriculture, health, the chemical industry, etc. However, these additional scenarios can be foreseen to build on the same pool of primary biodiversity data as the described climate change scenario.

3. The framework

As explained above, the typical biodiversity application scenarios require modelling the impact of climate change on species distribution. To build such models within a distributed computing environment, heterogeneous data resources (e.g. biodiversity, climatic and other environmental resources) and processing services (e.g. implementing ENM algorithms) must interoperate seamlessly. We have developed and thoroughly tested a conceptual framework to permit interoperability testing for biodiversity applications. This framework also allows testing the GEOSS service architecture through the development of relevant scenarios that draw on data and information exchange from a series of systems interconnected through SOA and by applying established standards.

Fig. 2 shows the overall system architecture. It consists of six main logical components:

1. Biodiversity Data Provider: a component providing biodiversity data. It supports two logical operations: a) getting an index of available datasets; b) getting data of a specific dataset.
2. Climatological Data Provider: a component providing climatological data. It supports two logical operations: a) collecting an index of available datasets; b) collecting specific data, after a suitable target dataset is identified.
3. Catalog: a component performing queries on the available biodiversity and climatological datasets. It supports search operations. Such operations can be very complex, applying different kinds of filters based on spatial and/or temporal criteria. It performs search operations using indices from known data providers. This catalog implements distribution and mediation functionalities (i.e. distribution and mediation for heterogeneous protocols, interaction style, interface type, information model) through the same service interfaces. It implements a broker service which supports extended interfaces for asynchronous query distribution and caching. Experience with initial implementations of the GEOSS architecture components has demonstrated the importance of a brokering service in order to facilitate discovery across the GEOSS federation. The mediation role applies to interoperability across catalog services provided by the GEOSS Climate Change, Biodiversity and other environmental communities.
4. Model Provider: a component that runs ENM techniques on selected biodiversity and climatological datasets. It supports a main operation to run the model by specifying the algorithm, the parameter values, and the datasets to be used.
5. Use Scenario Controller: a controller component that implements workflow within the business process typical of the biodiversity scenario described above. It is controlled by the user through the GUI.
6. Graphical User Interface (GUI): the component for user interaction. It controls the workflow manager to perform the required operations for implementing the biodiversity scenario.

Fig. 2 – The logical architecture of the framework.

These components play the three typical roles of a SOA where Consumers discover Providers through a Registry. In our framework Data and Model providers are the Service Providers; the GUI-Controller pair acts as a Consumer and the Catalog plays the role of the Registry. Where necessary the Catalog also acts as a Broker between Consumer and Providers. This fourth role is necessary for heterogeneous and federated systems.

The previous logical architecture has been implemented using a layered web architecture with a Service-Oriented approach, selecting or deploying specific data and model providers, and introducing new components where required.

The functioning system includes multiple interacting components and implements simple user interfaces (Fig. 3).

The GBIF Portal Server and the Climatological Data Server are the data providers. Each of them has instances of interfaces for accessing metadata and data. The GBIF Portal Server implements a REST-based interface to retrieve taxonomic information and species occurrence data through HTTP-GET operations directed at specific resources addressed by proper URLs. The Climatological Data Provider implements an OGC WCS interface (OGC, 2005) providing functionalities for retrieving index and metadata (i.e. getCapabilities and describeCoverage) and data (i.e. getCoverage).

The catalog service is provided by the GI-Cat component that accesses the data servers' metadata/indexing interfaces to perform queries (Nativi et al., 2007a; Bigagli et al., 2006). It implements and can be accessed through a standard OGC CS-W interface (OGC, 2007a,b).

The OpenModeller component implements the Model Provider. It is able to run ENM according to different basic algorithms and parameters. It exposes a proprietary SOAP interface. Since it can work only on local files, it is necessary to upload all required data locally. To avoid a double transfer operation we added a Data Uploader component. This exposes a simple web interface that accepts a data description, including all the information required for accessing data. When a description is sent, the Data Uploader retrieves the data and stores it locally. Thus the logical interaction between the Controller and the Providers for data access (see Fig. 2) is implemented with an indirect interaction through the Data Uploader.

The Controller component implements the business process of the use scenarios. According to the instructions provided by the user through the GUI, the Controller accesses the Catalog and Model Provider for searching, evaluating and choosing data, and for running models.

Fig. 3 – The framework main components.

¹ http://www.gbif.org/prog/ocb/modeling_workshop.
² http://openmodeller.cria.org.br/wikis/omgui/Use_Case_Scenario_for_Open_Modeller_Web_Services_API.
³ http://openmodeller.sourceforge.net/.

4. Test scenario

A first demonstration dealt with the Canadian butterfly species (Amblyscirtes vialis) and its response to climate change. This demonstration was presented at the GEO IV Ministerial Summit as part of the achievements of the GEOSS IP3 for the Biodiversity and Climate Change SBAs (Species Response to Climate Change use scenarios) (Nativi et al., 2007b).
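As an aside on the data-access interfaces described in Section 3, the following is a minimal sketch of how a client might compose key-value-pair requests for the three WCS 1.0.0 operations named there. It is not the code used in the demonstration. The endpoint URL, the coverage identifier `tas_a1b_2050`, the grid size and the format name are all placeholders assumed for illustration; a real client would read the valid values from the server's capabilities and coverage descriptions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the paper does not give the server URL.
WCS_ENDPOINT = "https://example.org/wcs"


def wcs_request_url(request, **params):
    """Build a key-value-pair WCS 1.0.0 request URL."""
    query = {"SERVICE": "WCS", "VERSION": "1.0.0", "REQUEST": request}
    query.update(params)
    return WCS_ENDPOINT + "?" + urlencode(query)


def get_coverage_url(coverage, bbox, width, height, fmt, crs="EPSG:4326"):
    """URL for a GetCoverage call over a (west, south, east, north) box."""
    west, south, east, north = bbox
    return wcs_request_url(
        "GetCoverage",
        COVERAGE=coverage,
        CRS=crs,
        BBOX=f"{west},{south},{east},{north}",
        WIDTH=width,
        HEIGHT=height,
        FORMAT=fmt,
    )


# Index and metadata requests, then a data request.
# "tas_a1b_2050" and "GeoTIFF" are assumed names, not taken from the paper.
capabilities_url = wcs_request_url("GetCapabilities")
describe_url = wcs_request_url("DescribeCoverage", COVERAGE="tas_a1b_2050")
coverage_url = get_coverage_url(
    "tas_a1b_2050", (-141.0, 41.0, -52.0, 84.0), 89, 43, "GeoTIFF"
)

print(coverage_url)
```

In the framework described above, a URL of this kind would be part of the data description handed to the Data Uploader, which then fetches the coverage and stores it where the model server can read it.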
The deployment architecture realized for the demonstration is formalized by the schema depicted in Fig. 4. The GBIF Portal Node is the instance of the GBIF Data Portal Server, while an NCAR climatological server node hosts the Climatological Data Server. A Catalog Server Node located at the CNR-IMAA runs an instance of GI-Cat configured for returning CS-W responses according to the ISO profile (OGC, 2007b). Another Node located at CNR-IMAA hosts the OpenModeller server instance and the Data Uploader components; they must reside on the same Node.

Fig. 4 – The framework deployment architecture.

The other interacting Node is the User Device, which is typically a device capable of running a Web Browser and a Java Virtual Machine, such as a desktop or laptop computer. In the browser, it runs the Use Scenario application, which handles data uploading, model description and execution, and visualization of the output data. For performance reasons, the search operations are performed using a Java-based client of GI-cat (called GI-go GeoBrowser⁴).

In the following paragraphs the main components are described in more detail, highlighting the technological constraints and choices.

4.1. Biodiversity Data Provider

Biodiversity occurrences are discovered and accessed through web services published by the GBIF Data Portal⁵ and using widely deployed biodiversity standards.⁶

The GBIF Data Portal provides unified access to over 151 million primary species-occurrence records (both specimens and observations) from some 266 data providers around the world, covering a diverse range of taxa and ecosystems (Hobern and Saarenmaa, 2005). A high proportion of these records are geo-referenced, and ongoing efforts in the data providing communities stress the necessity and value of providing an accurate geo-location for records. The GBIF virtual database represents a unique resource for Earth Observation studies which require ground-truthing data, whether historical (to study change over time) or contemporary. In addition to the web based interface, which provides the user with three main routes into the data served by the GBIF network (a user can explore by species, by country or by dataset, with options to download the data), GBIF also exposes the data through several web services. These are described below.

The main components of the network contributed by GBIF are:

1. Data providing nodes: currently some 266, distributed around the world and growing.
2. A central registry of the data providing nodes, implemented using UDDI.
3. A central indexing and caching system of the data provided by the nodes.
4. A data portal front end providing unified access to all nodes on the network.
5. Web services for programmatic access to data on the network.

The GBIF data portal provides a number of web services:

1. A registry of data providing nodes implemented using SOAP to UDDI.
2. Several related REST style web services for data resources within the GBIF network, including:
   1. Taxon data web service: providing access to records of taxon concepts.
   2. Occurrence record data web service: providing access to records of the occurrence of organisms.
   3. Occurrence density data web service: providing access to records showing the density of occurrence records by one-degree cell.
   4. Provider web service: providing access to records describing the data providers.
3. Google Earth mapping service providing 1-degree cell density data or placement marks.
4. Prototype OGC compliant Web Map Service.

GBIF works closely with Biodiversity Information Standards (BIS)/TDWG,⁷ an international organisation that develops standards and protocols for sharing biodiversity data. Foremost amongst these, and deployed widely in the GBIF network, are the following:

1. Darwin Core⁸: a standard designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections. It includes an extension mechanism to allow inclusion of other information. Its geospatial extension is particularly relevant for GEOSS applications.
2. ABCD Schema⁹ (Access to Biological Collection Data): more comprehensive than Darwin Core, this is also designed to promote accessibility to biological collection data.
3. DiGIR¹⁰ (Distributed Generic Information Retrieval): based on HTTP, XML and UDDI, this is a protocol designed for unified access to distributed databases.
4. TAPIR¹¹ (TDWG Access Protocol for Information Retrieval): a newer HTTP/XML based protocol standard developed by BIS/TDWG for accessing structured data stored in distributed databases. It combines and extends the features of BioCASe (a protocol based on DiGIR and developed for the EU funded project BioCASE for use with ABCD encoded data) and DiGIR to provide a more generic protocol.

Fig. 5 – Results of the model demonstrated at the GEO Ministerial Meeting in Cape Town, South Africa, November 2007. The model and its projections are the result of successful interoperation of all components of the system. A) The Amblyscirtes vialis distribution projected for the year 2000; B) The Amblyscirtes vialis distribution projected for the year 2050 under the IPCC climate change scenario. Light marks correspond to 100% of probability; gray marks to 50% of probability (photo by Erik Nielsen).

⁴ http://zeus.pin.unifi.it/joomla/index.php?option=com_content&task=view&id=12&Itemid=59.
⁵ http://data.gbif.org/.
⁶ http://wiki.tdwg.org/twiki/bin/view/DarwinCore/WebHome.
⁷ http://www.tdwg.org/.
⁸ http://www.tdwg.org/activities/darwincore/.
⁹ http://www.tdwg.org/activities/abcd/.
¹⁰ http://digir.sourceforge.net/.
¹¹ http://www.tdwg.org/activities/tapir/, http://www.tdwg.org/dav/subgroups/tapir/1.0/docs/TAPIRSpecification_2008-09-18.html.
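To make the idea behind the occurrence density service in Section 4.1 concrete, the sketch below shows one simple way to count occurrence records per one-degree cell. It is an illustration only and says nothing about how GBIF computes its density records. The function names and the sample coordinates are invented for this example, and each cell is identified here by the integer latitude and longitude of its south-west corner, which is an assumed convention.

```python
import math
from collections import Counter


def one_degree_cell(lat, lon):
    """Return the south-west corner (lat, lon) of the enclosing 1-degree cell."""
    return (math.floor(lat), math.floor(lon))


def occurrence_density(records):
    """Count occurrence records per one-degree cell.

    records: iterable of (latitude, longitude) pairs in decimal degrees.
    """
    return Counter(one_degree_cell(lat, lon) for lat, lon in records)


# Made-up coordinates, for illustration only.
records = [(45.42, -75.69), (45.91, -75.02), (46.10, -75.50), (60.17, 24.94)]
density = occurrence_density(records)

for cell, count in sorted(density.items()):
    print(cell, count)
```

Using `math.floor` rather than integer truncation keeps cells consistent for negative coordinates: a longitude of -75.02 falls in the cell whose western edge is -76, not -75.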
4.2. Climatological Data Provider

Climatological data were obtained from the NCAR GIS portal,¹² which provides web access to free global datasets of climate change scenarios. These data (spanning 50 years from 2000 to 2050) have been generated for the 4th Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) by the Community Climate System Model (CCSM) (IPCC, 2007). This service can be discovered using the GEOSS Clearinghouse.

The portal provides several climate change scenarios, as provided by IPCC: a scenario is a description of a possible outlook for the future state of the world, not a forecast of the future. The constant 20th century forcing shows the least increase in future surface temperature, the B1 and A1B scenarios display moderate increases and the A2 scenario results in the largest response.

The interoperability experiments mainly considered the A1B scenario. The A1 storyline and scenario family describes a future world of very rapid and successful economic development, low population growth, and the rapid introduction of new and more efficient technologies. Major underlying themes are convergence among regions, capacity building and increased cultural and social interactions, with a substantial reduction in regional differences in per capita income. The A1 scenario family develops into four groups that describe alternative directions of technological change in the energy system. Main characteristics of the A1B scenario include: low population growth, very high GDP growth, very high energy use, low–medium land use changes, medium resource (mainly oil and gas) availability, and a rapid pace and direction of technological change favoring balanced development.

The A1B Scenario Run set is represented by five ensemble members. Climate models are an imperfect representation of the earth's climate system, and climate modellers employ a technique called ensembling to capture the range of possible climate states. A climate model run ensemble consists of two or more climate model runs made with the exact same climate model, using the exact same boundary forcings, where the only difference between the runs is the initial conditions.

The datasets are processed to generate grid coverages at 1° resolution in the ESRI ARCGrid format and served through the standard OGC WCS (Web Coverage Service) interface version 1.0 (OGC, 2005). Fig. 5 depicts the results obtained for a use case dealing with the Canadian common roadside skipper butterfly (A. vialis). This use case was demonstrated at the GEO Ministerial Meeting in Cape Town, South Africa, November 2007 (Nativi et al., 2007b).

4.3. Catalog service

GI-cat (Bigagli et al., 2004) is a distributed catalog providing a unique and consistent interface that enables the interrogation of biodiversity and climatological data resources. GI-cat exposes an OGC CS-W/ISO interface (OGC, 2007b) and is able to federate heterogeneous catalogs and access servers that implement international geospatial standards (e.g. OGC OWS). In addition, GI-cat implements a mediation server, making it possible to federate components that apply non-standard services (e.g. THREDDS/OPeNDAP servers) and GEOSS Special Arrangements for interoperability (e.g. GBIF). While waiting for the GEOSS Clearinghouse to provide an openly documented interface, an interoperability arrangement was introduced for the GBIF portal services. It consists of a formal mapping of the GBIF data model to the ISO 19115 core metadata profile, and of the adaptation between the GI-cat and GBIF service protocols.

4.4. Model service

OpenModeller,¹³ an open source Ecological Niche Modelling (ENM) framework, was used as the component for processing collected data and generating future projections. It is currently being developed by the Centro de Referência em Informação Ambiental (CRIA), Escola Politécnica da USP (Poli), and the Instituto Nacional de Pesquisas Espaciais (INPE) as an open-source initiative. It is developed as a stand-alone application (OpenModeller Desktop), but the modelling kernel is also accessible through specific modules implementing external interfaces like SOAP and SWIG (Simplified Wrapper and Interface Generator).¹⁴ In our demonstration we use the SOAP server module implemented as a CGI component of an Apache Web Server.

The proprietary OpenModeller SOAP interface implements operations for the basic modelling activities. The main ones are:

1. getLayers for viewing the available environmental layers;
2. getAlgorithms for viewing the available modelling algorithms;
3. createModel for creating a model based on selected environmental layers and the provided species occurrences data;
4. projectModel for projecting a pre-generated model according to selected environmental layers (e.g. climate model outputs).

The interfaces to the most time demanding operations (createModel and projectModel) are implemented in an asynchronous way. Each operation call returns a ticket which can be used in a getProgress operation.

At the time of the demonstration implementation we needed to resolve some interoperability issues for integrating the OpenModeller SOAP Server in our framework:

- Environmental data access: OpenModeller was not able to access remotely located environmental data. Thus we added the Data Uploader to retrieve the required data and to store it in a proper local directory.
- Occurrence data access: OpenModeller required providing occurrence data in the createModel request message. We wanted the same approach for both environmental data and biodiversity data. We solved this issue by uploading occurrence data to a Web folder using the Data Uploader. Then the Controller could access the required data and properly build the request message.
- Occurrence data format: OpenModeller required a specific format for occurrence data. For performance reasons the format translation is carried out by the Data Uploader during the upload.
- Model output access: OpenModeller saves the model outputs in a local directory. To make them accessible we simply expose the directory through a Web Server.

Environmental and biodiversity data searching on a catalog service was implemented through the transparent interoperability with GI-cat.

4.5. The client application

To implement the use scenario business logic and the user-system interface we developed a client application running in a Web browser environment using AJAX¹⁵ technologies. With this tool, the user is guided through the process of: 1) discovering data (by submitting queries to GI-cat) and accessing selected data through the GBIF and WCS/NCAR data servers; 2) creating the model; 3) running ENM projections; 4) showing results.

The user interface reflects the typical use scenario workflow (see Fig. 6). A different tab is dedicated to each of the four main operations: Data Search and Access, Model Creation, Model Projection and Output View. A fifth tab is used for debugging. Inside each tab the respective sub-operations are available through “accordion” menus whose content is dynamically updated. The user interface is implemented in JavaScript, XHTML and XSLT, using the jQuery library for graphical effects and GUI widgets.

Fig. 6 – Client application: user-interface.

The application implementing the required business logic is also implemented in JavaScript. A simple SOAP client library has been developed to interact with the OpenModeller Server. The client application provides functions that implement: the access to the required services, the building of request messages, the presentation of response messages, and the interaction with the user.

5. Conclusions

In this paper, we have described how the distributed components needed for research on the biodiversity consequences of global climate change can be linked. An informatics framework was presented and discussed. This framework was successfully demonstrated at the GEO IV Ministerial Meeting in Cape Town, South Africa, November 2007, as part of the GEOSS IP3 task.

The framework described in this paper is the first to make ENM available to any user with a web browser and through web services. It is an example of an electronic scratch-book for data analysis, automating the steps of the workflow. Such capabilities will be needed from the GEOSS Portal in future.

The framework presents valuable innovations such as: an OpenModeller service online with an AJAX client, the OpenModeller environmental and biodiversity data searching integrated in a transparent way through the interoperability with a standard catalog service (i.e. the CS-W implemented by GI-cat), and the mapping of GBIF standard metadata to the ISO 19115 core profile (the metadata model applied by GI-cat).

¹² http://www.gisclimatechange.org.
¹³ http://openmodeller.sourceforge.net.
¹⁴ http://www.swig.org/.
¹⁵ http://www.w3schools.com/ajax/default.asp.
The tests of this framework demonstrated the need for international standards to support interoperability, and the effectiveness of establishing Special Arrangements for interoperability where these standards are not fully supported or extended. This is important to establish crosswalks among heterogeneous information communities. Distributed and mediation catalog services, implementing a broker approach, proved to be a good solution for managing the complexity of multi-disciplinary or federated systems, like GEOSS.

The framework helped validate GEOSS' initial infrastructure, contributing and linking systems. GBIF and IPCC components were registered in the architecture registers. GBIF REST-based services were submitted as Special Arrangements to the GEOSS Standards and Interoperability Forum (SIF) (Khalsa et al., 2007a,b). Other legacy interfaces, characterizing the resource provider components, were identified as needing reconsideration on the basis of international standards.

The "IP3 Mediator", based on the GI-cat technology, will become a component of the GEOSS services architecture. This service is able to query and access GBIF data through a standard OGC CS-W interface; queries are allowed by area, time interval, taxa, data sources, and free text keywords.

Another important lesson learned was the need to include modelling tools in the resources managed by GEOSS.

The framework components described here do not yet make use of the GEO Portals, as they were not available at the time when this work was done (the first half of 2007). This interoperability topic will be developed in the near future. In fact, the IP3 framework will be extended and its multi-disciplinary capabilities will be strengthened, demonstrating the impact of local Climate Change on Biodiversity (2008–2009).

In our opinion, this pilot framework and its successful implementation demonstrate the importance of GEOSS for efficient macroecological research.

Acknowledgment

We thank Siri Jodha Khalsa, leader of the IP3 initiative, for his contributions to this work.

REFERENCES

Bigagli, L., Nativi, S., Mazzetti, P., Villoresi, G., 2004. GI-Cat: a web service for dataset cataloguing based on ISO 19115. Proc. of 1st International Workshop on Geographic Information Management (GIM'04) - 15th International Workshop on Database and Expert Systems Applications (DEXA'04). IEEE Computer Society Press. ISBN: 0-7695-2195-9, pp. 846–850.

Bigagli, L., Nativi, S., Mazzetti, P., 2006. Mediation to deal with information heterogeneity - application to Earth System Science. European Geosciences Union, Advances in Geosciences, vol. 8, pp. 3–9. SRef-ID: 1680-7359/adgeo/2006-8-3.

Buckley, L.B., Roughgarden, J., 2004. Biodiversity conservation: effects of changes in climate and land use. Nature 430, 2 p following 33.

Canhos, V.P., Souza, R., de Giovanni, R., Canhos, D.A.L., 2004. Global biodiversity informatics: setting the scene for a "New World" of ecological modelling. Biodiversity Informatics 1, 1–13.

Elith, J., Graham, C.H., NCEAS Species Distribution Modelling Group, 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29, 129–151.

GEO, 2005. In: Battrick, Bruce (Ed.), Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan. ESA Publications Division, The Netherlands. ISBN: 92-9092-495-0. ISSN No.: 0250-1589.

GEO, 2007–2009. Work plan. Towards convergence. 30 pp. Group on Earth Observations, Geneva.

Harte, J., Ostling, A., Green, J.L., Kinzig, A., 2004. Biodiversity conservation: climate change and extinction risk. Nature 430, 2 p following 33.

Hobern, D., Saarenmaa, H., 2005. GBIF data portal strategy. 40 pp. GBIF Secretariat. http://circa.gbif.net/Public/irc/gbif/dadi/library?l=/architecture/portal_strategy_1/.

IPCC, 2007. Summary for policymakers. In: Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K.B., Tignor, M., Miller, H.L. (Eds.), Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.

Kerr, J.T., Kharouba, H., Currie, D.J., 2007. The macroecological contribution to global change solutions. Science 316, 1581–1584.

Kerr, J.T., Ostrovsky, M., 2003. From space to species: ecological applications for remote sensing. Trends in Ecology and Evolution 18, 299–305.

Khalsa, S.J., Nativi, S., Shibasaki, R., Ahern, T., Rainer, J.M., 2007a. The GEOSS Interoperability Process Pilot Project, EGU proceedings, Vienna (Austria), 15–20 April 2007.

Khalsa, S.J., Nativi, S., Shibasaki, R., Ahern, T., Thomas, D., 2007b. The GEOSS Interoperability Process Pilot Project, IGARSS '07, Barcelona (Spain), July 2007.

Khalsa, S.J., Nativi, S., Geller, G., submitted for publication. The GEOSS Interoperability Process Pilot Project (IP3). Submitted to IEEE TGARS Special Issue on Data Archiving and Distribution.

Kharouba, H.M., Algar, A., Kerr, J.T., in press. Historically calibrated predictions of butterfly species' range shift using global change as a pseudo-experiment. Ecology.

Nativi, S., Bigagli, L., Mazzetti, P., Mattia, U., Boldrini, E., 2007a. Discovery, query and access services for Imagery Gridded and Coverage Data: a clearinghouse solution. IGARSS '07, Barcelona (Spain), July 2007.

Nativi, S., Mazzetti, P., Saarenmaa, H., Kerr, J., Kharouba, H., Ó Tuama, É., Singh Khalsa, S.J., 2007b. Predicting the impact of climate change on biodiversity - a GEOSS scenario. GEO Ministerial IV Plenary, Cape Town, 29–30 November 2007. The Full Picture 2007 GEO Book. 262–264. Tudor Rose, Leicester, UK.

OGC, 2005. OpenGIS® Web Coverage Service (WCS) Implementation Specification, Ver. 1.0 (Corrigendum) (1.0.0), OGC 2005 document N. 05-076.

OGC, 2007a. OpenGIS® Catalog Services Specification, Ver. 2.0.2, OGC 2007 document N. 07-006R1.

OGC, 2007b. Catalogue Services Specification 2.0.2 - ISO Metadata Application Profile, Ver. 1.0.0, OGC 2007 document N. 07-045.

Peterson, A.T., Sanchez-Cordero, V., Soberon, J., Bartley, J., Buddemeier, R.H., Navarro-Siguenza, A.G., 2001. Effects of global climate change on geographic distributions of Mexican Cracidae. Ecological Modelling 144, 21–30. www.specifysoftware.org/Informatics/bios/biostownpeterson/Petal_EM_2001.pdf.

Peterson, A.T., Ortega-Huerta, M.A., Bartley, J., Sanchez-Cordero, V., Soberon, J., Buddemeier, R.H., Stockwell, D.R.B., 2002. Future projections for Mexican faunas under global climate change scenarios. Nature 416, 626–629. www.specifysoftware.org/Informatics/bios/biostownpeterson/Petal_N_2002.pdf.

Santana, F.S., Siqueira, M.F., Saraiva, A.M., Correa, P.L.P., 2008. A reference business process for ecological niche modelling. Ecological Informatics 3 (1), 75–86.

Thomas, C.D., Cameron, A., Green, R.E., Bakkenes, M., Beaumont, L.J., Collingham, Y.C., Erasmus, B.F.N., Ferreira de Siqueira, M., Grainger, A., Hannah, L., Hughes, L., Huntley, B., van Jaarsveld, A.S., Midgley, G.F., Miles, L., Ortega-Huerta, M.A., Peterson, A.T., Phillips, O.L., Williams, S.E., 2004. Extinction risk from climate change. Nature 427, 145–148 (8 January).

White, P.J., Kerr, J.T., 2007. Human impacts on environment–diversity relationships: evidence for biotic homogenization from butterfly species richness patterns. Global Ecology and Biogeography 16, 290–299.
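
As an illustration of the ticket-based asynchronous interface described in Section 4.4, where createModel and projectModel return a ticket that is then passed to getProgress, the client-side polling loop might look like the following minimal Python sketch. It is not the authors' JavaScript client and makes no SOAP calls. The FakeModellerServer class, its method names and signatures, the 0 to 100 progress scale, and the wait_for_ticket helper are all assumptions introduced here for illustration only.

```python
import itertools
import time


class FakeModellerServer:
    """In-memory stand-in for a ticket-based modelling service (hypothetical)."""

    def __init__(self, steps_per_job=3):
        self._steps_per_job = steps_per_job
        self._polls = {}                  # ticket -> progress polls seen so far
        self._ids = itertools.count(1)

    def create_model(self, layers, occurrences):
        # A real server would start the computation here; we only issue a ticket.
        ticket = f"ticket-{next(self._ids)}"
        self._polls[ticket] = 0
        return ticket

    def get_progress(self, ticket):
        # Each poll advances the simulated job by one step; returns 0-100.
        self._polls[ticket] = min(self._polls[ticket] + 1, self._steps_per_job)
        return 100 * self._polls[ticket] // self._steps_per_job


def wait_for_ticket(server, ticket, poll_interval=0.0, max_polls=100):
    """Poll get_progress until the job reports 100; return the progress history."""
    history = []
    for _ in range(max_polls):
        progress = server.get_progress(ticket)
        history.append(progress)
        if progress >= 100:
            return history
        time.sleep(poll_interval)
    raise TimeoutError(f"{ticket} did not finish after {max_polls} polls")


# Usage: submit a job, keep the ticket, then poll until completion.
server = FakeModellerServer(steps_per_job=3)
ticket = server.create_model(
    layers=["temperature", "precipitation"],      # illustrative layer names
    occurrences=[(45.4, -75.7), (46.8, -71.2)],   # illustrative lat/lon pairs
)
history = wait_for_ticket(server, ticket)
```

Bounding the number of polls, rather than looping until completion, is one way to keep a client from waiting indefinitely on a job that never finishes. A browser-based client would more likely schedule the polls with timers than block between them.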