
Cyberinfrastructure Research at Virginia Tech, July 2004 ...



  2. IGNEOUS ROCKS, TERRANES AND CRUSTAL EVOLUTION: SOLID EARTH SCIENCE CYBERINFRASTRUCTURE RESEARCH AT VIRGINIA TECH

PROJECT PLAN, JULY 2004

Contact: A.K. Sinha
Department of Geoscience
Virginia Tech
Blacksburg, VA 24061
E-mail: pitlab@vt.edu
Report available at http://pitlab.geol.vt.edu
  3. GEON Research Working Group at Virginia Tech

Principal Investigator
o A.K. Sinha, Department of Geosciences, Virginia Tech

Other Participants
o Cindy Stover, Department of Geosciences, Virginia Tech

Graduate Student Participants
o Ari Mitra, Department of Geosciences, Virginia Tech
o Alex Zendel, Department of Geography, Virginia Tech
o Amine Chigani, Department of Computer Science, Virginia Tech
o Jihane Najdi, Department of Computer Science, Virginia Tech
o Matt Phillip, Department of Computer Science, Virginia Tech
o Raghavendra Nyamagoudar, Department of Computer Science, Virginia Tech
o Satish Tedapelli, Department of Computer Science, Virginia Tech

Undergraduate Participants
o Andrew Owusu-Asiedu, Department of Business Info Tech, Virginia Tech
o Ryan Walker, Department of Geosciences, Virginia Tech
o Keely Larson, Department of Geosciences, Virginia Tech
o All

Other Past Participants
o Boyan Brodaric, Geological Survey of Canada
o Murray Journay, Geological Survey of Canada
o Calvin Barnes, Texas Tech University
o Art Snoke, University of Wyoming
o Clinton Smyth, GeoReference Inc., Vancouver
o Naren Ramkrishnan, Department of Computer Science, Virginia Tech
o And the research team at the San Diego Super Computer Center
  4. CONTENTS

Project Summary
GEON Research at Virginia Tech
General Workflow of Rock Explorer
    1. What is Workflow?
    2. Introduction to Rock Explorer
    3. General Workflow of Rock Explorer
    4. Component Specifications
    5. Product Prospective
The Igneous Rock Database
    1. Introduction and General Description
    2. Storing data that describe geologic bodies and sample locations
    3. Handling Metadata
    4. Handling heterogeneous data sources
    5. Digital Images of Samples
    6. Connectivity of this database to the overall GEON system
    7. Reference Dataset
Ontology & Earth Sciences
    1. What is Ontology
    2. The Need for and Use of Ontologies
    3. Technical Aspects of Ontology Development Section
Tool Box
    1. Introduction
    2. Implementation in Java
    3. Tool Categories
        A. Calculation
            a. Example: Norm Calculation
            b. Example: Mineral Formula Calculation
        B. Classification
            a. Example: Use of SVGs and Igneous Rocks
        C. Modeling
            a. Partial Melting Modeling
    4. Application and Use
    5. Classification Diagrams and Tools for Igneous Rocks
Data Mining in Earth Sciences
    1. What is Data Mining
    2. Components of Data Mining Algorithms
    3. How is data mining different from statistical analysis
    4. Ontology driven data mining research in solid earth science at Virginia Tech
Information Integration Scenario
    1. A-Type Integration Scenario
Summary Statement
References
Rock Classification Diagrams Reference
  5. Section I. Project Summary

The GEON (GEOscience Network) research project is responding to the pressing need in the geosciences to interlink and share multidisciplinary data sets to understand the complex dynamics of Earth systems. To rise to this challenge, we have formed a coalition of IT researchers, representing key technology areas, and Earth Science researchers, representing a broad cross-section of Earth Science sub-disciplines. The need to manage the vast amounts of Earth science data was recognized through NSF-sponsored meetings, which gave birth to the Geoinformatics initiative. The creation of GEON will provide the critical initial infrastructure necessary to facilitate Geoinformatics and other research initiatives, such as EarthScope.

Creating the GEON cyberinfrastructure to integrate, analyze, and model 4D data poses fundamental IT research challenges due to the extreme heterogeneity of geoscience data formats, storage and computing systems and, most importantly, the ubiquity of "hidden semantics" and differing conventions, terminologies, and ontological frameworks across disciplines. GEON IT research focuses on modeling, indexing, semantic mediation, and visualization of multi-scale 4D data, and on creation of a prototype GEONgrid, to provide the geoscience community an IT head start in facing the research challenges posed by understanding the complex dynamics of Earth systems. An important contribution will be embarking on the definition of a Unified Geosciences Language System (UGLS) to enable semantic interoperability. The GEONgrid leverages experience gained in the National Partnership for Advanced Computational Infrastructure (NPACI) program and the TeraGrid Distributed Terascale Facility. We will create a portal to provide access to the GEON environment, which will include advanced query interfaces to distributed, semantically-integrated databases, Web-enabled access to shared tools, and seamless access to distributed computational, storage, and visualization resources and data archives.

Two testbed regions, the mid-Atlantic and the Rocky Mountains, have been identified to define the GEON geoscience challenges, though the system will be able to accommodate national and international research activities. These testbed regions were selected because of the variety of geological issues embodied within them, which require interlinking of multiple disciplinary databases, and because they are areas of expertise for the GEON geoscience research team. The results of GEON research will significantly impact large multi-scale geoscience research programs such as EarthScope, as well as individuals and smaller groups of researchers, thereby leading to an intellectual transformation of the entire science. Recognizing this potential, the U.S. Geological Survey has joined as a major partner and has made creation of key GEON databases a priority effort over the next several years. Via DLESE, GEON will become an important resource for sharing knowledge about the Earth for a variety of audiences, including K-12 students and teachers.

Many disciplinary geoscience database projects are already underway, indicating the readiness of the community to participate in such a national-scale effort. As NSF and other agencies begin to invest in these database creation efforts, the need for a cyberinfrastructure that will enable integration of these databases becomes pressing. Various GEON-like grid efforts, such as GriPhyN, NEESGrid, and BIRN, have all indicated the readiness of the IT community to provide the necessary interoperable infrastructure, and testify to the value of integration of IT with major science and education initiatives. It is now the opportune moment to start the GEON program, to usher the geosciences into the era of Geoinformatics and accelerate geoscience research in a timely manner. In sum, GEON is an IT-based Geoscience revolution that will play a critical role in a more holistic understanding of the dynamics of Earth systems. It will also create new scientific paradigms and renew the excitement in the community of the post-plate tectonics era.
  6. Section II. GEON Research at Virginia Tech

Studies of magmatic processes in geoscience research provide fundamental information on the physical and chemical evolution of continents through time. Diverse tectonic processes of collision, extension, and transpression leave a rich geologic record saturated with igneous rocks. Plutons often provide our only direct monitor of the thermal budget of a region, and they also yield information about uplift histories and paleogeophysical properties of rocks through time. The utilization of such critical existing knowledge and information in the geosciences to conduct comprehensive interdisciplinary research is always hampered by the lack of organizational structures, despite such data being already available in the literature. In our ongoing research we will construct a geospatially-referenced information system on plutons, develop web-based information access mechanisms, and collaborate with researchers conducting similar efforts. With these collaborative efforts we will be able to help form a comprehensive geoscience information system for research and education purposes.

This research program envisions constructing a field based database on plutons and their ages to discover new relationships between magmatism, crustal evolution and other disciplinary geologic records preserved in both space and time. It is very likely that future research on integrating correlations between evolution of sedimentary basins, extensional/compressive tectonism to metamorphism, and the geophysical signatures of such varied processes is only possible through recognition of the thermal budget of the crust/mantle through both space and time. It must also be emphasized that our understanding of near surface processes is closely linked to information on the thermal interplay between deep crust and mantle. Therefore all geologic data that contribute to this connection of “Planetary Energetics and Dynamics” (NSF Geosciences Beyond 2000) must be at the forefront of databases designed to study dynamic earth processes. The linkages between crustal instability enhanced by thermal perturbations as recorded by plutons are often preserved in exposed roots of mountain belts, and the proposed database constructed on igneous rocks (plutonic rocks) through time within the Appalachian orogen will not only yield correlations between other geologic processes (e.g. style of thrusting, basin development, cooling rates, paleogeothermal gradients, tectonic setting) but also provide a firm basis for a national database leading to a digital earth model.

Figure 2.1: Geologic map of the central Appalachian orogen (compiled from State Geological Surveys)

The Appalachian Orogen is a continental scale mountain belt that provides a geologic template to examine the growth and breakup of continents through plate tectonic processes. The record spans a period in excess of 1000 million years, and preserves the only known example in the world of a completely preserved one and a half Wilson Cycle (the opening and closing of oceans). Complex assembly of plates through collision can be recognized in the rock record of the mid-Atlantic Appalachian orogen through time. It constitutes a great scientific challenge for earth scientists to readily and clearly identify data for the critical separation of overprinted processes involved in the creation of mountain belts. The Paleozoic Era in the mid-Atlantic region provides evidence of multiple collisional events (Taconic, Acadian and Alleghanian), the cause and effects of which are the subject of numerous leading research activities.
  7. In summary, our research activities would provide the community an opportunity to address scientifically unique questions that will arise through integrating the proposed database with other disciplinary databases (specifically geochronology, metamorphism, experimental petrology, structural geology, stratigraphy, magnetic, gravity, seismic, geothermal) as a function of space and time, thus leading eventually to a 4-D visualization of the thermal-mechanical evolution of the continental crust (of significance to the EARTHSCOPE initiative), and most importantly, providing the next generation of geoscientists a valuable educational and research tool.

In order to develop an IT based understanding of the cause and effects of geological processes associated with crustal evolution, our research in GEON will focus most of its resources towards developing an integrated view of crustal evolutionary processes represented in the Appalachian orogen.

Some Key Scientific Questions in the Appalachian orogen which are likely to utilize information from the igneous rock record:
• How do recognizable events (deformation, metamorphism, magmatism, sedimentation) relate to one another during collision, during extension, and during exhumation? How do we link or relate such events in the broader plate tectonic picture?
• How do we identify terranes that may be involved in multiple orogenies and ascertain their role in orogeny?
• What are the geologic scenarios for thin skinned terrane accretion?
• What is the relationship between rheology and deformation at all crustal levels? (a research recommendation of the NSF sponsored workshop report on New Departures in Structural Geology and Tectonics (2003))
• What is the relationship between tectonic settings and paleo-geothermal gradients?

Figure 2.2: Regional distribution of igneous rocks in the central Appalachian orogen compiled by Virginia Tech. The extensive distribution of igneous rocks with associated databases facilitates discovery of new hypotheses for the origin of terranes and accretionary orogens.

Figure 2.3 shows the plan to systematically build the cyberinfrastructure, including both object and process ontologies, to facilitate a more robust integrated understanding of crustal evolution.

Figure 2.3: Progressive development of the cyberinfrastructure with emphasis on igneous rocks, time and crustal evolution.

Section III. General Workflow of Rock Explorer

1. What is Workflow?
In this part of the paper, we use the term workflow to describe a series of structured activities and computations (Munindar P. Singh, Mladen A. Vouk) that arise in designing software applications. However, workflows (especially scientific workflows) have a broader focus. Many scientists and engineers utilize
  8. these workflows to design, execute, monitor, and communicate their analytical procedures with minimal effort. These workflows provide necessary abstractions that enable effective communication between domain agents and IT expertise.

2. Introduction to Rock Explorer
Rock Explorer is a prototype web-based portal that is being designed by Virginia Tech to allow the user to explore national and international databases, tools and ontologies for different types of rocks: igneous, metamorphic and sedimentary. This portal will also permit access to information about rocks from the two GEON testbeds (Mid Atlantic and Rocky Mountains) as well as other parts of North America. Figure 3.1 is a proposed web page layout of the portal page for Rock Explorer.

[Figure 3.1 mock-up: GEON Rock Explorer. Tabs: Home, Igneous Rocks, Metamorphic Rocks, Sedimentary Rocks, Test Beds. Goal Statement: Access, Analysis, Visualization, and Modeling of Rock Data]

Figure 3.1: A proposed web layout of the portal page

In figure 3.2 we show our proposed web page interface to the igneous rocks portlet. This page provides access to databases, ontologies, and data analysis tools related to igneous rocks. Icon based access to the regional igneous rock database (i.e. Mid Atlantic Region) as well as to other reference databases will be possible. This interface will also host access to animations that show the distribution of igneous rocks as a function of time and other attributes. We envision this activity to provide the end user a geospatial map (with zooming capabilities) where connectivity to databases can be established by clicking on a polygon. For example, information regarding name, age, records of rock and mineral analyses as well as images contained in the database can be highlighted through the use of a pop-up box.

The relationships among the database content, tools, and services are represented in a high level workflow diagram (figure 3.3). This diagram shows the various research activities being conducted at Virginia Tech.

3. General Workflow of Rock Explorer
The workflow (figure 3.3) highlights the different components of Rock Explorer. Each component represents a research area that Virginia Tech’s team is serving within the GEON project. There are four main nodes to this prototype web application: databases, ontologies, tools, and data mining. In addition, this tool presents a prototype solution to an integration scenario that shows how geoscientists can use ontologically based queries to extract information from databases, and use their findings to develop new knowledge.

4. Component Specifications
First, the databases section of Rock Explorer allows access to databases related to igneous rocks. This section directly accesses Virginia Tech’s igneous rock data warehouse. In addition, it provides links to reference rock databases available on the web. The data accessed are then available to be downloaded, analyzed, or processed inside the tool box section (Figure 3.4).

Second, Rock Explorer makes many levels of igneous rock ontologies that have been developed by Virginia Tech available for analysis and modification, if required, to make integration more robust. For readability purposes, these ontologies are represented in different formats; some are class diagrams, others are tree-structures. Similar to raw data sets, these ontologies are available to the community for modifications or additions of new ontologies. It is anticipated that the San Diego Supercomputer Center will be the guardian of these ontologies as they are developed by the earth science community (Figure 3.5).
  9. Figure 3.2: A proposed web interface of the igneous rocks portlet

Figure 3.3: Workflow diagram that highlights the different components of Rock Explorer. Each component represents a research area being investigated at Virginia Tech
  10. The third component of the workflow represents the tool box section. In this part of Rock Explorer, the user is able to access and apply different data analysis tools to their data. In figure 3.6, the Tool Selector actor facilitates choosing a tool based on three major categories: calculation, classification, or modeling. Under each of these categories, the user will find different analysis tools, several of which have already been developed and made available by Virginia Tech. More extensive visualization tools will be available through the GEON portal. Others can be readily implemented and added as this prototype application evolves and emerges into a national cyber infrastructure project. Based on the tools selected, the Output Selector actor decides the output formats based on the results of the selected tool and/or the user format preferences (discussed later in tool box section).

Figure 3.6: Workflow diagram showing types of data analysis tools developed at Virginia Tech and available for web based application. Some classification related activities have already been implemented by the research team at SDSC

For demonstration purposes, figures 3.7a and 3.7b (from Efrat Jaeger, San Diego Supercomputer Center) illustrate the workflow for modal classification, which utilizes mineral abundances to assign a rock its name. Figure 3.7a shows that the mineral abundance data for a given ssID is extracted from the modalData database and is sent to the classifier along with the appropriate diagram. The result is a name for the rock, which can be utilized for additional queries.

Figure 3.7a: Workflow diagram of modal classification for naming igneous rocks

Inside the classifier (Figure 3.7b), the modal classification is divided into finer descriptive levels. At each level, a diagram is chosen according to the region of the classification diagram from the previous level, and the (new) classification diagram yields a sub-type category name for the rock.

Figure 3.7b: Workflow representation of components inside Classifier actor in figure 3.7a

5. Product Prospective
The Rock Explorer workflow represents our vision of how a geoscientist would like to use these data analysis tools in a web based environment. It also serves to provide a platform for collaborative research with the San Diego Supercomputer Center team (SDSC). Our intention is to provide a prototype which will be utilized by SDSC to develop the cyber infrastructure for the earth sciences.
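The modal classification workflow described above (mineral abundances for an ssID are passed to a classifier, which walks through successively finer diagrams) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDSC or Virginia Tech implementation: the `Diagram` class, the function names, the sample values, and the two diagram levels are all hypothetical, and the region boundaries are deliberately simplified stand-ins for a real classification diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for the modalData database: modal abundances
# (volume %) keyed by sample identifier (ssID). All values are invented.
MODAL_DATA: Dict[str, Dict[str, float]] = {
    "ss-001": {"quartz": 30.0, "alkali_feldspar": 30.0,
               "plagioclase": 30.0, "mafic": 10.0},
    "ss-002": {"quartz": 2.0, "alkali_feldspar": 3.0,
               "plagioclase": 5.0, "mafic": 90.0},
}


@dataclass
class Diagram:
    """One classification diagram: maps modal data to a named region."""
    name: str
    locate: Callable[[Dict[str, float]], str]
    # Region name -> finer diagram to apply at the next level, if any.
    finer: Dict[str, "Diagram"] = field(default_factory=dict)


def mafic_split(modes: Dict[str, float]) -> str:
    # Assumed first level: separate very mafic-rich rocks from the rest.
    return "ultramafic" if modes["mafic"] >= 90.0 else "QAP field"


def qap_region(modes: Dict[str, float]) -> str:
    # Assumed second level with simplified, illustrative boundaries:
    # normalise quartz (Q), alkali feldspar (A), plagioclase (P) to 100 %.
    q = modes["quartz"]
    a = modes["alkali_feldspar"]
    p = modes["plagioclase"]
    if q + a + p == 0 or a + p == 0:
        return "unclassified"
    q_norm = 100.0 * q / (q + a + p)
    if not 20.0 <= q_norm <= 60.0:
        return "other QAP region"
    p_ratio = 100.0 * p / (a + p)
    if p_ratio < 10.0:
        return "alkali feldspar granite"
    if p_ratio < 65.0:
        return "granite"
    if p_ratio < 90.0:
        return "granodiorite"
    return "tonalite"


QAP = Diagram("QAP", qap_region)
TOP = Diagram("mafic content", mafic_split, finer={"QAP field": QAP})


def classify_sample(ss_id: str, top: Diagram = TOP) -> List[str]:
    """Return the region names found at each level, coarse to fine."""
    modes = MODAL_DATA[ss_id]
    path: List[str] = []
    current = top
    while current is not None:
        region = current.locate(modes)
        path.append(region)
        # The region found at this level selects the next, finer diagram.
        current = current.finer.get(region)
    return path


if __name__ == "__main__":
    for ss_id in MODAL_DATA:
        print(ss_id, classify_sample(ss_id))
```

With the invented data above, `classify_sample("ss-001")` yields `['QAP field', 'granite']`, while `"ss-002"` stops at the first level with `['ultramafic']` because no finer diagram is registered for that region. Keeping each diagram as a separate object mirrors the text's idea that the region found at one level selects the diagram used at the next.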
  11. [Figure 3.4 mock-up]
Tabs: Home, Igneous Rocks, Metamorphic Rocks, Sedimentary Rocks, GEON TestBeds
Regions: Mid Atlantic, Rocky Mountain, Other Regions
Reference Databases (click on the links below to go to reference databases): Virginia Tech Igneous Rock Database, GEOROC Database, Other Databases
Mid Atlantic Region Database Files (click on a file link to access/download the file; all files are in Microsoft Access format): WholeRock_GeoChemistry Table, Texture Table, Minerals Table, Fracture Table, Isotope and Radiometric, Fabric Table, ModalData Table, Inclusions Table, PublisheAgesandInitial

Figure 3.4: Shows a draft of the web page layout of the databases section of Rock Explorer
  12. Figure 3.5: Workflow diagram that shows how ontologies related to igneous rocks will be accessed

This part of the section describes the properties of the tools used to design the workflow:

Ptolemy II: Most of the diagrams in this section were created using the Ptolemy II Version 4.0-Beta software. This software framework was developed as part of the Ptolemy project at University of California at Berkeley (Electrical Engineering & Computer Science). It is a Java-based component assembly framework with a graphical user interface called Vergil. The Ptolemy project studies modeling, simulation, and design of concurrent, real-time, embedded systems. The project is named after Claudius Ptolemaeus, the second century Greek astronomer, mathematician, and geographer. For more information refer to the Ptolemy Project web page.

Kepler: Kepler is a unique system that combines high-level workflow design with execution and runtime interaction, access to local and remote data, and local and remote service invocation. The development of Kepler was based on the dataflow-oriented Ptolemy II system. It inherits many advanced features from Ptolemy, and numerous extensions and features have been added recently for supporting scientific workflows. Kepler is a collaboration between computer and domain scientists with the SEEK project (http://seek.ecoinformatics.org), the GEON project (http://www.geongrid.org), the Ptolemy II software project (http://ptolemy.eecs.berkeley.edu), and the SDM Center (http://sdm.lbl.gov/sdmcenter). It is a cross-project, open source activity, with an active and growing community of developers and users.

Section IV. The Igneous Rock Database

1. Introduction and General Descriptions
This database schema was designed utilizing concept maps for igneous rocks and attributes associated with intrusive igneous bodies (plutons). Based on the attributes, we have developed a spatially enabled database schema that organizes data describing the geochemistry, mineralogy and radiometrics of igneous rock samples, and the geological bodies, enclaves, fractures, fabrics, textures and inclusions to which they are geologically and geographically related. Our geodatabase system also accesses images of geologic samples taken at varying scales. Hand samples, thin sections and individual images are linked to spatially documented samples and provide the end user a means to explore igneous rocks from outcrop to microscopic level.

2. Storing Data that Describe Geologic Bodies and Sample Locations
The primary function of this database is to store data that describe geological bodies and samples acquired in the field. Hence, the “GeologicBodyAndLocation” and “Sample” tables are central to the database and provide a unique identifier for every geologic entity. Using a geologic body’s “BodyID”, tabular data describing its geometry and enclaves can be accessed. This BodyID also links to polygon GIS layers mapped at three scales ranging from 1:500,000 to 1:24,000. The “PublishedAgesAndInitial” table, which stores data regarding the age of bodies as derived by geologists using a variety of methods, also contains a BodyID. Through this link, the numerical age of a body can be displayed on a GIS layer as annotation, or the symbology of the GIS layer may be based on these ages (e.g. older igneous bodies could be displayed with darker shades of blue and younger bodies with lighter shades).

Initially, every sample that is stored in this database is registered in the “Sample” table where it receives a unique identifier, its “ssID”. This ssID provides a direct link to data stored in over 16 other tables and 4 GIS feature classes. Geochemical data, extracted from both bulk rock and mineral analyses, are stored in the “WholeRock_Geochemistry” and “Minerals” tables, respectively. Depending on the particular isotope that was analyzed, data generated from isotope and radiometric analyses are stored in one of six radiometric tables shown in the bottom right corner of figure 4.2. The ssID also serves as the
  13. portal to data concerning a sample’s modal data, texture, fabric, fractures, fluid inclusions, and melt inclusions. The samples are georeferenced in the “SamplePoints” GIS feature class. Hence, the interconnectedness of the ssID will allow all of the data in these tables to be displayed geographically. For example, a middleware tool such as Isoplot may read data from one of the six radiometric tables and compute numeric ages. The output of this computation may be stored in the “PublishedAgesAndInitial” table and then displayed geographically via the web-based GEON GIS map viewer. Thus, this database provides a means for geologists to explore the geotemporal nature of their data, as well as the data provided by other GEON participants.

3. Handling Metadata
This database schema also contains tables that house metadata regarding geologic bodies and samples extracted from them. The “References” table contains information about the source of the data stored in other tables; it is directly linked to most of the other tables via a foreign key. This information includes the authors, article title, year of publication and the journal in which it is found. The “AnalyticalMethods” table describes the methods used to extract the data that are housed in other tables. For example, the geochemical data contained within the whole-rock geochemistry table may have been obtained using ICPMS, Electron Microprobe (EMP) or multiple methods. The methods used in each individual analysis (record) are documented via a foreign key that is directly linked to the “AnalyticalMethods” table.

4. Handling Heterogeneous Data Sources
To allow geoscientists to easily share data that may be stored in a variety of database systems with heterogeneous and inconsistent table structures, data conversion utilities must be developed so that a geologist’s data can be successfully appended to the geodatabase discussed here. Once these data are uploaded into the eventual GEON implementation of this georelational database, the scientist can explore his/her data via the many GEON tools, such as rock classifiers and statistical analyses. More importantly, this ontologically integrated geodatabase will enable the scientist to compare and contrast these uploaded data to data that were previously shared by others in the GEON community.

Figure 4.1: Demonstration of the spatial-temporal capabilities in the Virginia Tech Igneous Rock Database

5. Digital Images of Samples
This geodatabase system also accesses images of geological samples captured at varying scales. Thin section images can be shown in their entirety as viewed with the naked eye as well as microscopic images of mineral crystals. Because these images are linked to georeferenced samples, these images coupled with this database structure provide geologists with a means to explore the earth from global to microscopic scales.

6. Connectivity of this Database to the Overall GEON System
Finally, this database will be interconnected to the Igneous Rock Ontology, which is discussed in more detail in the following section of this paper. This interconnectedness will allow geologists to access data residing in this database using ontologically-driven, text-based queries. For example, if a geoscientist wishes to examine the isotope data, the ontology system can direct the request to the appropriate tables in the database; in this example these tables would be the six radiometric tables in Figure 4.2 and the “PublishedAgesAndInitial” table. The ontology system will also serve as
  14. the connection between this database and the GEON workflow and toolbox. The classifier described in section III of this paper must access the correct table in the database and then obtain data from the appropriate fields within it. For the integration scenario discussed in section II to function correctly, the ontology must be able to direct the classifier to the ModalData table and extract values from the geochemical fields that are necessary to plot the data on the appropriate classification diagram.

Figure 4.2: Igneous Rock Database Schema utilized by Virginia Tech researchers

7. Reference Dataset
A reference dataset for granites is being compiled to augment the existing database of igneous rocks as part of providing these data through a web interface for the scientific community. This dataset has been compiled in Microsoft Access format following the igneous rock database schema (developed by the Virginia Tech GEON team) from published literature and from contributions primarily from J. B. Whalen of the Geological Survey of Canada. This reference dataset presently consists of over 430 records of whole rock geochemical (major and trace element) analyses from different type localities of the world. Our database on granites utilizes their genesis/source for subdivision into four main types: A, S, I and M. This letter classification is widely accepted by the scientific community and is used to represent granites from various geologic environments.

The examination of these granites through various classification schemes (e.g., Whalen, et al., 1987, Chappell & White, 2001, and Eby, 1990) helps geologists understand fundamental rock forming processes. The chemical parameters in these classification schemes are indicative of the tectonic environment (Pitcher, 1983) and also constrain the genesis of the granites from various sources, e.g. mantle or crust. This in turn helps us develop our understanding of crustal evolution. Availability of such reference databases provides a needed
resource to evaluate deposits of economic minerals like tin, copper, tungsten and molybdenum, as they are commonly associated with certain types of granites.

The user will have the option of accessing the Virginia Tech Reference Database as well as existing databases like GEOROC, USGS (Pluto database), PetDB, etc. directly through the web, thereby enhancing web querying capabilities and easing the process of comparing data from different datasets. This reference dataset on granites is also geared to help achieve the integration scenario discussed in Section VIII.

Section V. Ontology & Earth Sciences

1. What is Ontology?
In order for an agent to ask queries and make statements about a subject domain, a conceptualization of that domain needs to be described. Ontologies, which are explicit specifications of domain conceptualizations (Gruber, 1993), describe the entities in a subject domain, the relations among them, and the processes and functions that apply to them (Farquhar, Fikes, Pratt, and Rice, 1995). One of the most important goals in developing ontologies is sharing a common understanding of the structure of information among people or software agents (Noy and McGuinness, 2001). Other important reasons include: “enabling reuse of a specific domain knowledge in other domains, making domain assumptions explicit so that they can be changed as knowledge about the domain changes, separating domain knowledge from operational knowledge, and analyzing domain knowledge” (Noy and McGuinness, 2001).

2. The Need for and Use of Ontologies
In earth sciences, data sources consist not only of databases but also of analysis tools (section VI) used to analyze the information contained in these databases. Ontologies help to manage the interoperation between these different resources (Goble, Stevens, Ng, Bechhofer, Paton, Baker, Peim, and Brass, 2001). We emphasize the need to develop ontologies at multiple levels of granularity to facilitate exploration of data sources using an ontologically structured approach.

Ontologies are also used as a computational pathway for answering queries. Given a query, we can search for a set of appropriate paths in the ontologies and retrieve information from the target resources corresponding to the found path (Lu and Hsu, 2003). For instance, the integration scenario described in section II, which will be discussed in detail in section VIII, will implement the igneous rock ontologies to answer geological queries. This method therefore makes the details of querying web resources clear to geologists, and allows them to seek and utilize geological resources on the web more efficiently. We also recognize that schema-based queries of databases access only the object view, but a knowledge base requires the application of both process and object ontologies.

3. Technical aspects of Ontology Development
In order to use our ontologies in a web-based system environment, we have developed these ontologies using the Web Ontology Language (OWL). OWL is a language for defining and instantiating web ontologies (Smith, Welty, and McGuinness, 2004). An OWL ontology might include descriptions of classes, properties, instances of classes, and relationships between these instances (Smith, Welty, and McGuinness, 2004). Classes describe concepts in the domain. Any class can also have subclasses, which represent concepts that are more specific than the superclass (Noy and McGuinness, 2001).

In Figure 5.1, we show the class diagram of the high-level conceptual model relating the concept Body to its constituents: rocks, structural features, etc. Specific measurements of attribute elements (single or multi-valued) taken in the field or laboratory are inter-related through relationships and cardinality.

For example, the class Solid is related to class Rock through a “consists of” relationship, as well as an explicit cardinality expression that one Solid body may consist of multiple types of rocks. Similarly, an instance of class Rock contains minerals and chemical elements.
The class diagram also presents the inheritance hierarchy that illustrates the class-subclass concept in relation to the igneous rock ontology. An instance of the class Body represents any substance in one of two states: solid or liquid. Based on properties associated with a Body, for example temperature and pressure, a body can be classified as either a solid or a melt. Although our ultimate objective is to characterize igneous sources through complete solid and melt ontologies, our current research focuses on solid igneous bodies. Class Igneous represents the concept of an igneous rock. Properties of this class represent the attributes of the concept. These properties could be instances of other classes. In our hierarchy, the Igneous class has two subclasses: Plutonic and Volcanic.

In Figure 5.1, every class represents a separate concept which can have its own ontology. A number of ontologies have already been developed; for instance, the concepts of unit, space, and earth layer (SWEET, 2004). However, little research has been done to create ontologies for other concepts that are equally important to earth sciences. Furthermore, the existing ontologies are very generic and need to be altered to communicate the needs of geoscientists. Such ontologies are usually limited to a specific domain, and not adaptable for describing other domains.

We realize that these activities will yield prototype ontologies, but they represent our understanding of the igneous rock domain. We anticipate community participation in exploring our ontologies and providing feedback for the development of a more robust and general ontology for igneous rocks. With this approach, the fusion of all ontologies will result in a framework for the entire earth sciences.

Figure 5.1: Class diagram representing object view ontology for igneous rocks
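The inheritance hierarchy described in this section can be mirrored in code. The following is a minimal sketch in Python, not the OWL ontology itself. The class names follow the text; the attribute names, the units, and the solidus_c threshold used to separate a solid from a melt are assumptions made for illustration, since the paper does not state the classification rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rock:
    name: str
    minerals: List[str] = field(default_factory=list)  # a Rock "contains" minerals


@dataclass
class Igneous(Rock):
    pass


@dataclass
class Plutonic(Igneous):
    pass


@dataclass
class Volcanic(Igneous):
    pass


@dataclass
class Body:
    temperature_c: float   # assumed attribute name and unit
    pressure_kbar: float   # assumed attribute name and unit


@dataclass
class Solid(Body):
    rocks: List[Rock] = field(default_factory=list)  # "consists of", one-to-many


@dataclass
class Melt(Body):
    pass


def classify_body(temperature_c, pressure_kbar, solidus_c):
    """Return a Solid below the caller-supplied solidus temperature, else a Melt.

    The single-threshold rule is a placeholder for whatever property-based
    rule the ontology finally adopts.
    """
    if temperature_c < solidus_c:
        return Solid(temperature_c, pressure_kbar)
    return Melt(temperature_c, pressure_kbar)


# One Solid body may consist of several rocks (the cardinality in the text).
body = classify_body(650.0, 5.0, solidus_c=900.0)
body.rocks.append(Plutonic("granite", ["quartz", "feldspar"]))
```

Keeping Plutonic and Volcanic as subclasses of Igneous means that a query phrased against Igneous also matches instances of either subclass.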
Section VI. Tool Box

1. Introduction
Virginia Tech is contributing the Tool Box, a set of practical tools, to the GEON project to be used over the internet by geoscientists. These tools are written in Java and will be accessible via a Java Server Page (JSP). The Tool Box consists of tools of three different categories: Calculation, Classification, and Modeling. Of the tools provided, the user can select the tool he/she wants to use with the Tool Selector. The user can also select the form of the output. The color selector provides a choice of colors and the symbol selector provides symbol choices. The output selector lets the user choose the output format through the color and symbol selectors. The user can also choose the scale to be used (log, semilog, etc.) for the output.

2. Implementation in Java
In order to utilize geological data it is often useful to perform numerical computation on such data. Numerical computations that deal with classification of rocks and minerals are being implemented.

The mineral data is stored in a Microsoft Access database. A Java class will be used to connect to the database via a JDBC connection. It will then read that data and perform the analysis. This analysis will be returned to the Java Server Page that called the Java class and will be displayed via a web page to the user.

3. Tool Categories
A. Calculation Tools
Geologists use various computational tools to analyze and interpret data. Methods for norm, radiometric age, temperature, etc. perform various calculations on geological data, and the results are used in further analysis such as classification and modeling. These methods are part of the Tool Box as calculation tools.

a. Example: Norm Calculation
Norm is a calculated mineralogic composition based on the conversion of a whole rock chemical analysis into the formulas of common minerals. The oxides in the rock analysis are allocated, following a prescribed set of rules, to simple end-member formulas of the rock-forming and common accessory minerals. The norm allows a chemical analysis of a rock to be recast in terms of the common minerals by which it can be classified. This is particularly useful for rocks which are too fine-grained for gathering modal data (Philpotts, 1990).

The norm calculation is implemented in Java. Some modifications (Hollocher, 2004) to this methodology have been incorporated. Our current implementation reads a tab-delimited text file of sample data and calculates the weight norms for the data. These weight norms are output as a tab-delimited text file. The norm calculations are then used for rock classification. The web interface of the norm tool will take as input a set of sample rock data, or an individual sample in a specific format, and output the norm calculations in tabular format. The user can also upload a file of sample data through the web interface to get the norm calculation for that data. Geologists can use this web interface to analyze their data.

b. Example: Mineral Formula Calculation
Feldspars, pyroxene, olivine, garnet, mica, and epidote are common minerals found in most igneous rocks. We have developed calculation tools as part of a prototype for web-based analysis of mineralogic data.

The input for each of these mineral tools is the mineral composition in weight percent. These data are used, together with the molecular proportions of oxides, the atomic proportions of oxygen from each molecule, and the number of anions on the basis of the accepted number of oxygen atoms, to calculate the number of ions for the mineral formula.

These tools will be accessible via a web interface that accepts input data via a form on an HTML web page or through an uploaded tab-delimited file. Once the calculations are performed by our Java program, the results will be output via JSP. With such an interface, computations can be done without downloading and/or installing any tool. It also makes these tools platform independent.

The Virginia Tech igneous rock database includes mineral compositions and can be accessed through this computation tool. For example, the data for the mineral olivine from the Baltimore mafic complex can be accessed through this tool to provide subclasses of mineral names within the olivine family. The first step in acquiring the subclasses of a sample of olivine is to select the desired data from the database. As discussed in the data section of this paper, each sample is uniquely identified by an ssID value. After the desired data is selected as pictured in Figure 6.1, the data is plotted as outlined in Figure 6.2.

Figure 6.1: Table showing partial analysis for mineral olivine

As seen in Figure 6.2, the samples that have been selected are classified into different olivine subclasses. The graphics for representing the subclasses are discussed in the following section.

Figure 6.2: SVG based graphics for discussing minerals of olivine group.

B. Classification Tools for Minerals and Rocks
The sample data can be classified by rock name, mineral, tectonic setting, rock affinity, etc. The various methods to classify the data are included in the Tool Box as classification tools.

a. Example: Use of SVGs and Igneous Rocks
GEON research at Virginia Tech is aimed at providing some of the computational tools and graphics for web-based use. Current research has emphasized the coding of scalable vector graphics (SVG) for igneous rock/mineral classification. As SVG is a language for describing two-dimensional graphics in XML, it allows various types of graphics objects such as text and vector graphic shapes (lines, curves, polygons, etc.) to be implemented in a web environment.

The code for binary and ternary plots for the igneous rock/mineral classification is written in SVG and includes a header which specifies the dimension and orientation of the images.
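A minimal sketch of how an SVG ternary classification diagram of this kind could be generated is given here. It is written in Python with the standard library, and is not the Virginia Tech SVG code. The function names, the default sizes, and the single field "Field 1" with its vertices are placeholders, not boundaries from any published classification diagram. Each field carries an SVG title element, which most browsers display as a tooltip.

```python
import math
import xml.etree.ElementTree as ET


def ternary_to_xy(a, b, c, size):
    """Map a ternary composition to x, y with A at the top, B bottom-left, C bottom-right."""
    total = a + b + c
    height = size * math.sqrt(3) / 2
    x = (a * size / 2 + c * size) / total
    y = height * (1 - a / total)
    return x, y


def build_ternary_svg(labels, fields, size=300, margin=30):
    """Return an SVG document (as text) with the outer triangle and named inner fields."""
    height = size * math.sqrt(3) / 2
    # The root element plays the role of the header: it fixes the image dimensions.
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(size + 2 * margin),
        "height": str(round(height) + 2 * margin),
    })
    group = ET.SubElement(svg, "g", {"transform": f"translate({margin},{margin})"})

    def points(vertices):
        return " ".join("%.1f,%.1f" % ternary_to_xy(a, b, c, size) for a, b, c in vertices)

    # Outer triangle first, then a text label at each vertex.
    corners = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    ET.SubElement(group, "polygon",
                  {"points": points(corners), "fill": "none", "stroke": "black"})
    for label, corner in zip(labels, corners):
        x, y = ternary_to_xy(*corner, size)
        text = ET.SubElement(group, "text", {"x": "%.1f" % x, "y": "%.1f" % y})
        text.text = label

    # Inner classification fields, each named through a <title> child.
    for name, vertices in fields.items():
        poly = ET.SubElement(group, "polygon",
                             {"points": points(vertices), "fill": "lightgray", "stroke": "black"})
        ET.SubElement(poly, "title").text = name
    return ET.tostring(svg, encoding="unicode")


# Placeholder field: a small triangle near the A apex.
svg_text = build_ternary_svg(
    ("A", "B", "C"),
    {"Field 1": [(1, 0, 0), (0.6, 0.4, 0), (0.6, 0, 0.4)]},
)
```

Generating the polygons from ternary coordinates keeps the field boundaries in compositional units, so the same definitions could later be reused to test which field a sample falls in.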
A few examples shown below include polygons within triangles, where individual polygons represent a rock association. Every image is drawn using an SVG header, which initializes the dimensions of the triangle. The triangle is drawn first using the polygon function and is annotated at the vertices. Then the polygons inside the triangle are drawn. The rock types annotate the polygons. The SVG images show the name of the polygon on moving the mouse over that polygon. This was incorporated for better visualization of the images. These images will be used in the modal classification tool. A demonstration workflow of such a classification tool is shown in Figures 3.5a and 3.5b. In the modal classification tool, the PointInPolygon module plots the sample data over an appropriate SVG diagram, and the Classifier module analyzes such diagrams and classifies the sample data. This classification procedure is done in multiple levels. At each level, an SVG diagram is chosen according to the region of the point(s) to be classified in the previous level, and a new region for the point(s) is calculated according to the transition table, the region of the point(s), and their mineral information contained in this level’s diagram. Either the region is classified and given a rock name, or it leads to a different SVG diagram.

The SVG diagrams are classified into different classes: 1) Tectonic setting classifier for granites, 2) Element concentration rock name classifier, 3) Tectonic setting classifier for basalts, 4) Source classifier, 5) Modal QAPF rock name classifier, 6) Magma association classifier, 7) Magma type classifier.

C. Modeling Tools
a. Partial Melting Modeling
The melting of source rocks is the process of partial melting, and modeling is the numerical expression of that process. By modeling the partial melting process we are able to gain insight into the source and ultimately discover the rock’s origin (Figure 6.5).

The partial melting modeling tool accepts two matrices as input: a source rock data matrix and a melting mode matrix. These matrices can be uploaded via a web page or read from a database. From these two matrices and a stored mineral values matrix, the source rock bulk distribution coefficient and the melting mode distribution coefficient are calculated.
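The two coefficients just described can be sketched as weighted sums. The paper does not give the formulas, so this sketch assumes the form conventionally used in trace-element melting models: each coefficient is the sum, over minerals, of the mineral proportion times that mineral's partition coefficient (Kd) for the element. The function name, the dictionary layout, and every numerical value here are hypothetical, and the code is Python rather than the authors' Java.

```python
def bulk_coefficient(proportions, kd_table, element):
    """Sum of (mineral weight fraction x mineral/melt Kd) for one element.

    proportions: mineral name -> weight fraction (must sum to 1)
    kd_table:    mineral name -> {element -> Kd}
    """
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mineral proportions must sum to 1")
    return sum(frac * kd_table[mineral][element]
               for mineral, frac in proportions.items())


# Hypothetical partition coefficients standing in for the stored mineral values matrix.
KD = {
    "olivine": {"La": 0.01, "Yb": 0.05},
    "garnet": {"La": 0.01, "Yb": 4.0},
}

source_modes = {"olivine": 0.8, "garnet": 0.2}   # minerals in the source rock
melting_modes = {"olivine": 0.5, "garnet": 0.5}  # proportions entering the melt

d_source_yb = bulk_coefficient(source_modes, KD, "Yb")    # source rock bulk coefficient
p_melting_yb = bulk_coefficient(melting_modes, KD, "Yb")  # melting mode coefficient
```

Repeating the call for every element gives one row of each coefficient matrix; the choice of melting equation applied afterwards is left open here.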
Then, from these two coefficient matrices we compute enrichment factor matrices for both the source rock data matrix and the melting mode matrix for any given value of F. Finally, chondrite-normalized values are calculated from the enrichment factor matrices. The steps listed have been implemented in Java and are available for dehydration melting, amphibolite melting, eclogite melting, granulite melting, hydrous graywacke melting, and tonalite melting.

Figure 6.3: Zr/4-Nb*2-Y

Figure 6.4: Th-Hf/3-Ta

4. Application and Use
There are various advantages of the Tool Box. Currently many of the tools are available but are part of different packages, and the user has to download and/or install them to use them. Many of these tools are proprietary software and are expensive. The Tool Box will have all the tools in one place, free of cost, and will provide most of the functionality of proprietary software. The source code for the tools will be available to all; users can download the source and modify it as per their requirements, add additional features, or make corrections in the future. As the Tool Box will have a web interface, the tools will be platform independent. The only software required will be a Java Plug-in for the web browser. The Tool Box provides the advantage of ease of use. Geologists need not go through the learning curve of understanding the workings of the software packages. They only need to set some parameters and pass the input (in the required format), and the interface of the Tool Box will take care of the rest by making the software transparent to the user. When combined with other components of the GEON project like data mining, the Tool Box will help in interpretation of geological data towards answering key science questions through better visualization and analysis.

5. Classification Diagrams and Tools for Igneous Rocks

Table 1: Classification Diagrams and Tools for Igneous Rock implemented for Web-based Research (References are in the reference section of this paper)

Classification                              Images                           Reference
Tectonic Setting Classifier for Granites    Nb v Y                           14
                                            Ta v Yb                          14
                                            Rb vs (Y + Nb)                   14
                                            Rb-(Yb + Ta)                     14
                                            Hf-Rb/10-Ta*3                    22
                                            Hf-Rb/30-Ta*3                    22
Element Concentration Rock Name Classifier  TAS Alkalis - Silica             5
                                            Molecular Normative Composition
                                            Na2O + K2O vs SiO2               1
                                            Nb/Y vs Zr/TiO2                  20
Tectonic Setting Classifier for Basalts     Ti vs V                          16
                                            Zr-Zr/Y                          13
Source Classifier A, I, S, M Types          Ga-Al-Zr                         19
Modal QAPF Rock Name Classifier             Q-A-P-F (Plutonic rocks)         17
                                            Q-A-P-F (Volcanic rocks)         18
Magma Association Classifier                Cr-Ti (OFB and LKT)              10
                                            Zr-Ti (LKT, CAB, OFB)            12
                                            Zr-Ti/100-Sr/2                   12
                                            Na2O+K2O-FeOt-MgO                3
                                            SiO2-Na2O+K2O                    3
                                            SiO2-FeOt/MgO                    8
                                            FeOt/MgO-FeOt                    8
                                            FeOt/MgO-TiO2                    8
                                            MnO*10-TiO2-P2O5*10              9
                                            SiO2-K2O                         2
                                            SiO2-Na2O+K2O
                                            SiO2-K2O                         6
                                            SiO2-Al2O3                       6
                                            SiO2-FeOt/(FeOt+MgO)             6
                                            Mw%-Fw                           6
                                            Cw%-FMw%                         6
Magma Type Classifier                       ACNK-ANK                         6

Section VII. Data Mining in Earth Sciences

1. What is Data Mining?
Data mining is the analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways.

The definition above refers to ‘observational data sets’ as opposed to ‘experimental data’. Data mining algorithms are usually applied to data sets collected for some purpose other than data mining analysis, so the data mining activity has no control over data collection.

The relationships found using a data mining algorithm are expected to be novel. These are often referred to as models or patterns. These relationships must be statistically significant (not occurring merely by chance).

2. Components of Data Mining Algorithms
• Model or Pattern Structure: A model is a high-level, global description of a data set. It may be descriptive or inferential. Descriptive models summarize the data in a concise and convenient way. Examples of descriptive models include models for the overall probability distribution of the data (density estimation) and partitioning data into groups (cluster analysis). Inferential models make a statement about the population from which the data were drawn or about likely future values. Examples of inferential models include regression models and mixture models. In contrast, a pattern is a local feature of the data, perhaps holding for only a few records or variables. Patterns represent departures from the general run of data: a pair of variables that have a particularly high correlation, a set of records that always score the same on some variables, and so on.
• Score function: This helps in judging the quality of a fitted model. Given a data set, there might be many possible models that can describe the data set. The purpose of a score function is to rank the models. Examples of score functions are mean squared error and the least squares principle.
• Optimizing and Search method: The data mining algorithm tries to optimize a score function by searching over different models and pattern structures. Efficient computational methods are required for finding the parameters for a model that optimize the score function.
• Data Management Strategy: While executing a data mining algorithm on a large data base, the data sets have to be handled efficiently. Designing a data base to store the data sets so that accessing subgroups of data is as fast as possible, choosing the proper data structures, and deciding which data sets need to be read into computer memory are all part of the data management strategy.

Data mining is a highly interdisciplinary activity. Statistics and mathematics play an important role in modeling the data (MacKay, 1992). Parallel processing techniques are used for handling large sets of data (Maniatty and Zaki, 2000). Visualization is essential to better understand the numerical output produced by the data mining algorithms (Thearling et al., 2001).

3. How is Data Mining Different from Statistical Analysis?
The key difference between statistics and data mining algorithms is that statistics is concerned with primary analysis: the data are collected (often using standard experimental design techniques) with particular questions in mind, and then are analyzed to answer those questions. Data mining is used for secondary data analysis, finding patterns and relationships that we have no idea about initially. Thus data mining helps in extracting hidden knowledge from data bases. The typical tasks in statistical analysis are fitting a model, testing a hypothesis and predicting the confidence intervals. The tasks in data mining are finding patterns (e.g., association rules), classification (e.g., Bayes classifier and neural networks), and grouping (e.g., clustering).

Figure 7.1: Broad Classification of GeoRoc Data set.

4. Ontology Driven Data Mining Research in Solid Earth Science at Virginia Tech
The data mining research at Virginia Tech is focused on combining ontologies and data mining techniques to design novel algorithms for earth sciences. Preliminary efforts in this area, focused on the domain of biology, can be found in (Reinoso-Castillo et al., 2003). As described in Section V, an ontology specifies the terms or concepts in a domain and the relationships that exist between them. Thus ontologies represent a user’s prior knowledge about the domain. The data sets of the domain can be structured by associating them with the concepts of the ontology. Applying data mining algorithms to hierarchically structured data sets allows discovering relationships at multiple levels of abstraction, as opposed to applying the algorithms to the unstructured data sets at a single level of abstraction. Different users may have different perceptions of the same data. Hence, they can supply their own ontologies to structure the data for data analysis. Thus, incorporation of ontologies into data mining algorithms facilitates multiple views of the same data.
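Structuring data sets by ontology concept, so that an analysis can be run at a chosen level of abstraction, can be sketched as follows. This is a Python illustration under stated assumptions, not the Virginia Tech implementation. The concept names are illustrative, and the data-set identifiers and the function name datasets_under are hypothetical.

```python
# Hypothetical concept hierarchy: parent concept -> list of sub-concepts.
SUBCONCEPTS = {
    "Convergent Margin": ["Continental Margin", "Oceanic Arc"],
    "Oceanic Arc": ["Fast Subduction", "Slow Subduction"],
}

# Hypothetical data-set identifiers attached directly to a concept.
DATASETS = {
    "Continental Margin": ["ds_cm_1"],
    "Fast Subduction": ["ds_fast_1", "ds_fast_2"],
    "Slow Subduction": ["ds_slow_1"],
}


def datasets_under(concept):
    """Collect the data sets of a concept and, recursively, of all its sub-concepts."""
    found = list(DATASETS.get(concept, []))
    for child in SUBCONCEPTS.get(concept, []):
        found.extend(datasets_under(child))
    return found


# A coarse comparison pulls everything under a high-level concept ...
all_arc = datasets_under("Oceanic Arc")
# ... while a finer comparison pulls only the specific sub-concepts.
fast_only = datasets_under("Fast Subduction")
```

A user who supplies a different hierarchy changes only the two dictionaries; the retrieval function, and any mining algorithm run on its output, stays the same.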
As a first step, the rock data collected from the “GeoRoc” data base (http://georoc.mpch-mainz.gwdg.de/) is being analyzed using common data mining techniques. The data set consists of geochemical analyses of different types of rocks, geospatially distributed all over the world. In order to apply data mining algorithms to this data set, we have created a hierarchical representation of it to reflect recognized plate tectonic settings, as shown in Figure 7.1. Each of the classes can be further described using a detailed ontology. For instance, the detailed ontology of the class convergent margins is shown in Figure 7.2.

Figure 7.2: Simplified concept map representing the sub-classes within the class of Convergent Margin

Convergent margin settings include simple geometrical relationships between the upper plate and the subducted (lower) plate. In addition, the composition (continental, oceanic) of both plates leads to well recognized geochemical affinities. Similarities and differences between these environments are further constrained by the rate of subduction, the angle of subduction, and the age of the plates involved in convergent margin settings. Similar ontologies for other classes exist as well. Through data mining techniques we explore patterns and differences at various geospatial resolutions for a given class. The knowledge represented by ontologies helps in choosing the level of abstraction desired in applying the data mining algorithms. For example, in the case of convergent margins one user may choose to compare continental margins and oceanic arcs. For this purpose the data sets belonging to all the sub-concepts of the concept “continental margins” in the hierarchy should be retrieved. Similarly, all the data sets corresponding to “oceanic arc” are also retrieved. Another user may choose to compare oceanic arcs with fast and slow subducted plates based on a cut-off. This is more specific and at a lower level of abstraction than the earlier example, so we need only the specific data sets corresponding to these settings. Application of data mining techniques at multiple geospatial scales is likely to yield new knowledge on the geological processes operative at different scales. Development of algorithms that can be used at different scales is necessary because the volume of data is expected to be significantly different when analysis is conducted for a single volcanic center vs. the entire arc. Research on data mining associated with sparse data will be coupled with methods available for analysis of large data sets (Ramakrishnan and Bailey-Kellogg, 2002). Application of models for spatial data (Ramakrishnan et al., 2004) is also under consideration.

Section VIII. Information Integration Scenario

1. A Geoscientist’s Integration Scenario
In order to justify ontologically driven knowledge discovery, we show an integration scenario where accessing a database through simple ontological constraints is necessary. Although more complex scenarios are more appropriate for geologic studies, this case study clearly identifies a problem not adequately addressed by existing ontologies, and provides informal semantics for objects and relations included in the ontology as well as a motivation for ontology development. We present the scenario in the form of a text-based query: “What is the distribution and U/Pb concordant zircon ages of A-type plutons in VA? How about their 3-D geometry?” (Figure 8.1)
Figure 8.1: A Schematic representation of a geoscientist’s information integration problem

To answer such a query, we have to follow a certain flow of information. The logical flow would include, but is not limited to, these steps:
1) Locate the state of Virginia.
2) Identify all igneous rocks from the geologic map.
3) Access the database on igneous rocks.
4) Discriminate between volcanic and plutonic rocks.
5) Filter mineralogical and geochemical data for the plutons.
6) Apply discriminant functions to classify plutons as A-type.
7) Access the age and geochronological database.
8) Use zircon as the mineral for identifying U/Pb concordant ages.
9) Locate the gravity database.
10) Overlay gravity data over polygons of A-type plutons in Virginia.
11) Deploy a tool for calculating the shape of the pluton that best fits the gravity data.
12) Display the 3-D geometry of the plutons.
These logical functions represent decomposition of the primary query into several sub-queries (Figure 8.2), each of which uses web-based access to data and tools.

Summary Statement
Our research activities utilize the concept that extracting knowledge from static databases requires well designed organizational structures that are able to identify relationships between the resources being analyzed. Queries across multiple databases, which contain information that may be logically linked, require organization of the concepts represented in the data. Our research recognizes that schema-based queries of databases access the object view (within igneous rocks), but geologic research also needs to include integration of the processes that affect or produce the objects. Therefore, the research goal at Virginia Tech is to create a prototype of a computer-based knowledge environment that specifically reflects the logic used by a geoscientist, with the recognition that his/her primary interest lies in understanding processes that have affected the rock record through time.
Figure 8.2: Shows the workflow of querying information and the output format selection process

References

Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludascher, B., & Mock, S., 2004. Kepler: An Extensible System for Design and Execution of Scientific Workflows. http://www.sdsc.edu/~ludaesch/Paper/ssdbm04-kepler.pdf

Chappell, B.W. & White, A.J.R., 2001. Two contrasting granite types: 25 years later. Australian Journal of Earth Sciences, v. 48, p. 489-499.

cSIS (CyberSTRUCTURE information system), 2003, http://www.sci.uidaho.edu/cyber/

Deer, W.A., Howie, R.A., & Zussman, J., 1992. An introduction to The Rock-Forming Minerals. Longman Publishers.

Eby, G.N., 1990. The A-type granitoids: a review of their occurrence and chemical characteristics and speculations on their petrogenesis. Lithos, v. 26, p. 115-134.

Farquhar, A., Fikes, R., Pratt W. & Rice, J., 1995. Collaborative Ontology Construction for Information Integration. Knowledge Systems Laboratory, Department of Comp. Sci., Stanford University. http://www-ksl.stanford.edu/KSL_Abstracts/KSL-95-63.html

Goble, C.A., Stevens R., Ng, G., Bechhofer, S., Paton, N. W., Baker, P. G., Peim, M., & Brass, A., 2001. Transparent access to multiple bioinformatics information sources. IBM Systems Journal, v. 40, no. 2, p. 532-551.

Gruber, T.R., 1993. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. Knowledge Systems Laboratory, Department of Comp. Sci., Stanford University. http://ksl-web.stanford.edu/KSL_Abstracts/KSL-93-04.html

Hollocher, K., 2004. Calculation of a Norm from a Bulk Chemical Analysis. http://www.union.edu/PUBLIC/GEODEPT/COURSES/petrology/norms.htm

King, P.L., White, A.J.R., Chappell, B.W. & Allen, C.M., 1997. Characterization and origin of aluminous A-type granites from the Lachlan Fold Belt, Southeastern Australia. Journal of Petrology, v. 38, no. 3, p. 371-391.

Lu, J., & Hsu, C., 2003. Query Answering Using Ontologies in Agent-based Resource Sharing Environment for Biological Web Information Integration. 18th International Joint Conference on Artificial Intelligence.

MacKay, D., 1992. Bayesian interpolation. Neural Computation, v. 4-3, p. 415-447.

Maniatty, W., & Zaki, M., 2000. A requirements analysis for parallel KDD systems. IPDPS Workshop.

Munindar P. Singh, & Mladen A. Vouk, 1996. Scientific Workflows: Scientific Computing Meets Transactional Workflows. NSF Workshop on Workflow and Process Automation in Information Systems.

Noy, N. F., & McGuinness, D. L., 2001. Ontology Development 101: A Guide to Creating Your First Ontology. Knowledge Systems Laboratory, Department of Comp. Sci., Stanford University. http://www.ksl.stanford.edu/people/dlm/papers/ontology101/ontology101-noy-mcguinness.html

Philpotts, A.R., 1990. Principles of Igneous and Metamorphic Petrology. Prentice Hall, New Jersey.

Pitcher, W.S., 1983. Granite type and tectonic environment. In: Mountain Building Processes, ed. Hsu, K. Academic Press, London, p. 19-40.

Ramakrishnan, N., & Bailey-Kellogg, C.K., 2002. Sampling Strategies for Mining in Data-Scarce Domains. IEEE/AIP Computing in Science and Engineering (CiSE), Vol. 4, No. 4.

Ramakrishnan, N., Bailey-Kellogg, C.K., Satish, T., & Pandey, V., 2004. Gaussian Processes for Active Data Mining, submitted to ACM KDD-2004.

Reinoso-Castillo, J., Silvescu, A., Caragea, D., Pathak, J. and Honavar, V., 2003. Information Extraction and Integration from Heterogeneous, Distributed, Autonomous Information Sources: A Federated, Query-Centric Approach. In: IEEE International Conference on Information Integration and Reuse.

Sinha, A.K., Zendel, A., Brodaric, B., & Barnes, C., 2004. Schema to Ontology for Igneous Rocks: implications for the development of a cyber infrastructure for the earth sciences, in press.

Smith, M.K., Welty, C., & McGuinness, D.L., 2004. OWL Web Ontology Language Guide. W3C. http://www.w3.org/TR/2004/REC-owl-guide-20040210

SWEET (Semantic web for earth and environmental terminology), 2004, http://sweet.jpl.nasa.gov/ontology

Thearling, K., Becker, B., DeCoste, D., Mawby, B., Pilote, M., & Sommerfield, D., 2001. Visualizing Data Mining Models. Published by Morgan Kaufmann.

Whalen, J.B., Currie, K.L. & Chappell, B.W., 1987. A-type granites: geochemical characteristics, discrimination and petrogenesis. Contributions to Mineralogy and Petrology, v. 95, p. 407-419.

Rock Classification Diagrams Reference

1. Cox, K.G., Bell, J.D., and Pankhurst, R.J., 1979. The interpretation of igneous rocks. George Allen & Unwin, London, United Kingdom (GBR).
2. Gill, J.B., 1981. Orogenic andesites and plate tectonics. Springer Verlag, Berlin, Federal Republic of Germany (DEU).
3. Irvine, T.N., and Baragar, W.R.A., 1971. A guide to the chemical classification of the common volcanic rocks. Canadian Journal of Earth Sciences, v. 8, no. 5, p. 523-548.
4. Jensen, L.S., 1976. A new cation plot for classifying subalkalic volcanic rocks. Ontario Geological Survey Miscellaneous Paper, no. 66, 22 pp.
5. LeBas, M.J., LeMaitre, R.W., Streckeisen, A., and Zanettin, B., 1986. A chemical classification of volcanic rocks based on the total alkali-silica diagram. Journal of Petrology, v. 27, p. 745-750.
6. Maniar, P.D., and Piccoli, P.M., 1989. Tectonic discrimination of granitoids. Geological Society of America Bulletin, v. 101, no. 5, p. 635-643.
7. Meschede, M., 1986. A method of discriminating between different types of mid-ocean ridge basalts and continental tholeiites with the Nb-Zr-Y diagram. Chemical Geology, v. 56, Issues 3-4, p. 207-218.
8. Miyashiro, A., 1974. Volcanic rock series in island arcs and active continental margins. American Journal of Science, v. 274, no. 4, p. 321-355.
9. Mullen, E.D., 1983. MnO/TiO2/P2O5; a minor element discriminant for basaltic rocks of oceanic environments and its implications for petrogenesis. Earth and Planetary Science Letters, v. 62, Issue 1, p. 53-62.
10. Pearce, J.A., 1975. Basalt geochemistry used to investigate past tectonic environment in Cyprus. Tectonophysics, v. 25, Issues 1-2, p. 41-67.
11. Pearce, J.A., 1996. Relationships between high field strength element geochemistry and tectonic setting of volcanic rocks. 30th International Geological Congress abstracts, vol. 30, v. 2, p. 367.
12. Pearce, J.A., and Cann, J.R., 1973. Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth and Planetary Science Letters, v. 19, Issue 2, p. 290-300.
13. Pearce, J.A., and Norry, M.J., 1979. Petrogenetic implications of Ti, Zr, Y and Nb variations in volcanic rocks. Contributions to Mineralogy and Petrology, v. 69, p. 33-47.
14. Pearce, J.A., Harris, N.B.W., and Tindle, A.G., 1984. Trace element discrimination diagrams for the tectonic interpretation of granitic rocks. Journal of Petrology, v. 25, p. 956-983.
15. Pearce, T.H., Gorman, B.E., and Birkett, T.C., 1977. The relationship between major element chemistry and tectonic environment of basic and intermediate volcanic rocks. Earth and Planetary Science Letters, v. 36, Issue 1, p. 121-132.
16. Shervais, J.W., 1982. Ti-V plots and the petrogenesis of modern and ophiolitic lavas. Earth and Planetary Science Letters, v. 59, Issue 1, p. 101-118.
17. Streckeisen, A., 1979. To each plutonic rock its proper name. Earth Science Review, v. 12, p. 1-33.
18. Streckeisen, A., 1979. Classification and nomenclature of volcanic rocks, lamprophyres, carbonatites, and melilitic rocks: recommendations and suggestions of the IUGS Subcommission on the Systematics of Igneous Rocks. Geology, v. 7, p. 331-335.
19. Whalen, J.B., Currie, K.L., and Chappell, B.W., 1987. A-type granites: geochemical characteristics, discrimination and petrogenesis. Contributions to Mineralogy and Petrology, v. 95, p. 407-419.
20. Winchester, J.A., and Floyd, P.A., 1977. Geochemical discrimination of different magma series and differentiation products using immobile elements. Chemical Geology, v. 20, p. 325-343.
21. Wood, D.A., 1980. The application of a Th-Hf-Ta diagram to problems of tectonomagmatic classification and to establishing the nature of crustal contamination of basaltic lavas of the British Tertiary Volcanic Province. Earth and Planetary Science Letters, v. 50, Issue 1, p. 11-30.