Building the eGrov Virtual Observatory

Published in: Technology, Education
  • Good afternoon. My name is Gabriela Montiel-Moreno, and I’m going to present our experiences in the definition and construction of e-GrOV, an infrastructure for developing scientific environments over astronomical data.
  • This presentation is organized as follows. First, I will present the problems related to the production and management of astronomical data and how virtual observatories aim to solve them. Then I will present e-GrOV, our proposal for the construction of scientific environments, and describe our experiences with the construction of the mini-grid and with data integration in e-GrOV. Finally, I will present the conclusions and future work.
  • Nowadays, radio and space telescopes and sensor technologies produce great amounts of astronomical data from specific measurements. These data are stored in distributed repositories and come in different formats, such as images, documents, relational databases, XML files and applications. Astronomers analyze these data to gain a better understanding of the universe. For example, consider an astronomer who wants to identify all sky objects related to the Andromeda galaxy. Today, the astronomer must carefully analyze each piece of data retrieved by the telescopes of his or her observatory and compare it with other related data in order to produce new knowledge. This process presents two main problems. First, because the information is distributed, an astronomer can miss relevant information that is located in a different observatory and is inaccessible. Second, as astronomers build more telescopes and astronomical tools, both the amount of astronomical data and its heterogeneity increase. As a result, astronomers find it difficult to manually execute analysis tasks and experiments over great amounts of astronomical data in a reasonable time. For these reasons, astronomers require a system that provides transparent access to astronomical data regardless of their location and integrates them according to the astronomers' requirements. This system must rely on the high computing and storage capabilities of different servers in order to manage and store astronomical data efficiently. Virtual observatories emerged as a solution to these requirements. A virtual observatory is defined as a system that integrates astronomical databases, analysis tools, applications and multiple computational services to form a scientific environment in which astronomical research activities can be conducted.
The main goal of a virtual observatory is to provide transparent access to heterogeneous and distributed astronomical resources, so that scientists can discover, access, analyze, and combine data generated by heterogeneous resources.
  • The construction of a virtual observatory poses two main challenges: the definition of a communication infrastructure between astronomical resources, and a middleware that promotes the exploitation of these resources. In order to access astronomical resources and transfer great amounts of data, it is necessary to have a communication network that provides a high data transfer rate between different applications, as well as reliability and security for transactions and for the execution of scientific experiments. We believe Internet 2 provides the capabilities required to execute experiments and access astronomical data in a reasonable time. Additionally, we consider a Grid infrastructure to connect distributed resources so that any user can view them in an integrated and controlled way. This infrastructure provides high scalability and specific name and directory services to identify the resources connected to the Grid. A grid maximizes the value of computational resources by using algorithms and scheduling techniques to manage them efficiently. With these tools, it is possible to distribute the workload among the nodes in the grid with respect to their computing capabilities and availability. Thanks to this infrastructure, the problem of access to data is solved. The challenges of middleware development concern the storage and management of data and the execution of experiments. We require storage mechanisms that determine an efficient way to store information in distributed repositories and to replicate information when necessary. It is also necessary to define mechanisms that allow researchers to automatically identify and access astronomical data adapted to certain requirements. Finally, experiment environments address the need for platforms that support the execution of experiments and analysis tasks over great amounts of astronomical data.
The e-GrOV is an effort to support the requirements of astronomers conducting research in astrophysics. It provides a platform to execute scientific processes that exploit the knowledge and computing capabilities of sources distributed across the grid, thanks to the support of Internet 2.
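The idea of distributing a workload among grid nodes according to their computing capabilities and availability can be illustrated with a minimal sketch. It is not the scheduler used in e-GrOV; the `Node` fields, the `distribute` function and the proportional-share rule are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str        # hypothetical node label
    capacity: int    # relative computing capability (assumed unit)
    available: bool  # whether the node can accept work right now


def distribute(tasks: int, nodes: list[Node]) -> dict[str, int]:
    """Split `tasks` among available nodes in proportion to their capacity."""
    active = [n for n in nodes if n.available and n.capacity > 0]
    if not active:
        raise ValueError("no available nodes")
    total = sum(n.capacity for n in active)
    # Integer (floor) share for each node first.
    shares = {n.name: tasks * n.capacity // total for n in active}
    # Hand any leftover tasks to the highest-capacity nodes, one each.
    leftover = tasks - sum(shares.values())
    by_capacity = sorted(active, key=lambda n: n.capacity, reverse=True)
    for n in by_capacity[:leftover]:
        shares[n.name] += 1
    return shares


if __name__ == "__main__":
    grid = [Node("a", 4, True), Node("b", 2, True), Node("c", 2, False)]
    print(distribute(9, grid))
```

Unavailable nodes receive no work, and the shares always add up to the number of tasks submitted.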
  • The e-GrOV provides a platform to organize, manage, provide access to, and disseminate mega-samples of astronomical data as well as their related tools and knowledge. e-GrOV interconnects and integrates astronomical resources such as clusters, databases, analysis tools and applications located in different nodes. The e-GrOV platform will give astronomers support to execute massive virtual analyses over astronomical data. In this way, e-GrOV provides the foundations for the construction of a Mexican Virtual Observatory (MVO). With an MVO, astronomers will be able to exploit great volumes of data and execute experiments in a reasonable time.
  • The construction of the e-GrOV is organized in three main phases. The first phase is expected to produce a mini-grid infrastructure that offers a platform to access and manage the data, services, and computing capabilities of astronomical resources. The second phase is expected to produce an environment for the definition and execution of scientific experiments that uses the services provided by the mini-grid. It will also produce a mechanism for managing experiment information by defining a database of experiment results. The last phase is expected to develop a virtual library composed of documents describing the resources offered in the grid, user manuals, user experiences and comments about the execution of experiments. By storing and managing virtual libraries, it is possible to represent the knowledge generated by astronomers after executing analysis tasks, and thus to promote collaboration and information sharing between different communities.
  • Now, I will present our experiences related to the construction of the e-GrOV mini-grid.
  • The e-GrOV mini-grid is composed of three nodes located in three Mexican universities and research centers: the Universidad de las Américas Puebla, the Autonomous University of Tlaxcala and the National Institute of Astrophysics, Optics and Electronics. The role of every node in the mini-grid is to provide computing capabilities to support the storage of replicas of astronomical data from public and private surveys, and to provide an access point for executing queries and experiments over these data. Additionally, e-GrOV provides integrated and transparent access to a set of public astronomical data surveys available on the Web, such as 2MASS, SkyServer and SkyServerSpectra, through the ANDROMEDA mediation system.
  • Now, I will present how astronomical data is integrated in e-GrOV to define mega-databases for scientific experiments.
  • The main objective of the e-GrOV mini-grid is to provide access to heterogeneous astronomical data for executing experiments. For this reason, a mediation system integrates astronomical data according to the characteristics of the user's requirements. We use ANDROMEDA to generate mega-databases of astronomical data. ANDROMEDA (Astronomical Data Resources Mediation) is an XML-based mediation system that provides transparent access to distributed astronomical data according to a user’s requirements. Transparent access is achieved through a global-as-view strategy. The mediator uses the semi-structured data model as its pivot model. Each source exports a local view of the data it contains through an exported schema implemented as an XML schema. In order to support semi-structured data and to hide the local specificities of each source, a wrapper makes the data accessible in XML format. Astronomers express their requirements as a mediation schema, which provides a unified global view of the local data resources. The mediation system can be configured to give access to different data sources according to the user’s requirements (data types, content, data quality, source validity). This is achieved by identifying which data elements in the schemas exported by the sources are relevant to a query and which are not and can therefore be omitted. Relevant data is represented intensionally through a mapping schema. Then, the mediation system establishes which operations (e.g., join) can be used to merge mapping schemata in order to populate the mediation schema.
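The global-as-view idea described above can be sketched in a few lines. This is not ANDROMEDA's implementation: the two sources, their contents, the element names and the join on an object identifier are all hypothetical. The sketch only shows wrappers exporting local data as XML and a mediated view defined as a join over those exported views.

```python
import xml.etree.ElementTree as ET

# Hypothetical local data held by two sources in different native formats.
SOURCE_A_ROWS = [("obj1", 10.68, 41.27), ("obj2", 10.70, 41.30)]  # (id, ra, dec)
SOURCE_B_MAGS = {"obj1": 12.4, "obj3": 15.1}                      # id -> magnitude


def wrap_source_a() -> ET.Element:
    """Wrapper: export source A's tuples as XML, hiding the tuple layout."""
    root = ET.Element("positions")
    for obj_id, ra, dec in SOURCE_A_ROWS:
        ET.SubElement(root, "position", id=obj_id, ra=str(ra), dec=str(dec))
    return root


def wrap_source_b() -> ET.Element:
    """Wrapper: export source B's dictionary as XML."""
    root = ET.Element("photometry")
    for obj_id, mag in SOURCE_B_MAGS.items():
        ET.SubElement(root, "measure", id=obj_id, mag=str(mag))
    return root


def mediated_objects() -> ET.Element:
    """Global view defined over the exported views: a join on the object id."""
    mags = {m.get("id"): m.get("mag") for m in wrap_source_b().iter("measure")}
    result = ET.Element("objects")
    for pos in wrap_source_a().iter("position"):
        obj_id = pos.get("id")
        if obj_id in mags:  # keep only objects present in both sources
            ET.SubElement(result, "object", id=obj_id, ra=pos.get("ra"),
                          dec=pos.get("dec"), mag=mags[obj_id])
    return result


if __name__ == "__main__":
    print(ET.tostring(mediated_objects(), encoding="unicode"))
```

A query against the mediated view never touches the native formats directly; it only sees the XML that the wrappers export.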
  • The main objective of the e-GrOV mini-grid is to provide access to heterogeneous astronomical data for executing experiments. For this reason, a mediation system enables the integration of astronomical data according to requirements characteristics. We use ANDROMEDA in order to generate mega databases of astronomical data. ANDROMEDA (Astronomical Data Resources Mediation) is a mediation system based in XML that provides transparent access to distributed astronomical data according to certain user’s requirements. Transparent access is achieved by a global as view strategy. The mediator implements the semi-structured data model as pivot model. Each source exports a local view of the data it contains through an exported schema implemented as an XML schema. In order to support semi-structured data and to hide the local specificities of each source, a wrapper ensures data accessibility in XML format. Astronomers express their requirements of a mediation schema. A mediation schema provides a unified global view of local data resources. The mediation system can be configured to give access to different data source according to user’s requirements (data types, content, data quality, source validity). This is achieved by identified the data elements from the schemas exported by the sources that are relevant to a query and which ones are not and can therefore be omitted. Relevant data is represented intentionally through a mapping schema. Then, the mediation system establishes which operations (e.g., join) can be used for merging mapping schemata to populate the mediation schema.
  • The main objective of the e-GrOV mini-grid is to provide access to heterogeneous astronomical data for executing experiments. For this reason, a mediation system enables the integration of astronomical data according to requirements characteristics. We use ANDROMEDA in order to generate mega databases of astronomical data. ANDROMEDA (Astronomical Data Resources Mediation) is a mediation system based in XML that provides transparent access to distributed astronomical data according to certain user’s requirements. Transparent access is achieved by a global as view strategy. The mediator implements the semi-structured data model as pivot model. Each source exports a local view of the data it contains through an exported schema implemented as an XML schema. In order to support semi-structured data and to hide the local specificities of each source, a wrapper ensures data accessibility in XML format. Astronomers express their requirements of a mediation schema. A mediation schema provides a unified global view of local data resources. The mediation system can be configured to give access to different data source according to user’s requirements (data types, content, data quality, source validity). This is achieved by identified the data elements from the schemas exported by the sources that are relevant to a query and which ones are not and can therefore be omitted. Relevant data is represented intentionally through a mapping schema. Then, the mediation system establishes which operations (e.g., join) can be used for merging mapping schemata to populate the mediation schema.
  • The main objective of the e-GrOV mini-grid is to provide access to heterogeneous astronomical data for executing experiments. For this reason, a mediation system enables the integration of astronomical data according to requirements characteristics. We use ANDROMEDA in order to generate mega databases of astronomical data. ANDROMEDA (Astronomical Data Resources Mediation) is a mediation system based in XML that provides transparent access to distributed astronomical data according to certain user’s requirements. Transparent access is achieved by a global as view strategy. The mediator implements the semi-structured data model as pivot model. Each source exports a local view of the data it contains through an exported schema implemented as an XML schema. In order to support semi-structured data and to hide the local specificities of each source, a wrapper ensures data accessibility in XML format. Astronomers express their requirements of a mediation schema. A mediation schema provides a unified global view of local data resources. The mediation system can be configured to give access to different data source according to user’s requirements (data types, content, data quality, source validity). This is achieved by identified the data elements from the schemas exported by the sources that are relevant to a query and which ones are not and can therefore be omitted. Relevant data is represented intentionally through a mapping schema. Then, the mediation system establishes which operations (e.g., join) can be used for merging mapping schemata to populate the mediation schema.
  • The main objective of the e-GrOV mini-grid is to provide access to heterogeneous astronomical data for executing experiments. For this reason, a mediation system enables the integration of astronomical data according to requirements characteristics. We use ANDROMEDA in order to generate mega databases of astronomical data. ANDROMEDA (Astronomical Data Resources Mediation) is a mediation system based in XML that provides transparent access to distributed astronomical data according to certain user’s requirements. Transparent access is achieved by a global as view strategy. The mediator implements the semi-structured data model as pivot model. Each source exports a local view of the data it contains through an exported schema implemented as an XML schema. In order to support semi-structured data and to hide the local specificities of each source, a wrapper ensures data accessibility in XML format. Astronomers express their requirements of a mediation schema. A mediation schema provides a unified global view of local data resources. The mediation system can be configured to give access to different data source according to user’s requirements (data types, content, data quality, source validity). This is achieved by identified the data elements from the schemas exported by the sources that are relevant to a query and which ones are not and can therefore be omitted. Relevant data is represented intentionally through a mapping schema. Then, the mediation system establishes which operations (e.g., join) can be used for merging mapping schemata to populate the mediation schema.
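  • The global-as-view idea described above can be illustrated with a minimal sketch. It is not ANDROMEDA's actual API: the wrapper functions, element names, attribute names and sample values below are all hypothetical and chosen only for illustration. Each wrapper exposes its source as XML (the pivot format), and the global view is defined as a join over the local views.

```python
import xml.etree.ElementTree as ET


def sky_server_wrapper() -> ET.Element:
    """Hypothetical wrapper: exports positions from one source as XML."""
    return ET.fromstring(
        "<objects>"
        "<obj id='1' ra='10.68' dec='41.27'/>"
        "<obj id='2' ra='10.70' dec='41.30'/>"
        "</objects>"
    )


def two_mass_wrapper() -> ET.Element:
    """Hypothetical wrapper: exports magnitudes from another source as XML."""
    return ET.fromstring(
        "<sources>"
        "<src ref='1' jmag='2.09'/>"
        "<src ref='3' jmag='5.10'/>"
        "</sources>"
    )


def build_mediation_view() -> ET.Element:
    """Global-as-view: the global 'skyobject' element is defined as a join
    of the two local views on the object identifier (made-up mapping)."""
    # Index the second source by its reference to the object id.
    magnitudes = {s.get("ref"): s.get("jmag") for s in two_mass_wrapper().iter("src")}
    root = ET.Element("skyobjects")
    for obj in sky_server_wrapper().iter("obj"):
        obj_id = obj.get("id")
        if obj_id in magnitudes:  # keep only objects present in both sources
            ET.SubElement(
                root,
                "skyobject",
                id=obj_id,
                ra=obj.get("ra"),
                dec=obj.get("dec"),
                jmag=magnitudes[obj_id],
            )
    return root


if __name__ == "__main__":
    print(ET.tostring(build_mediation_view(), encoding="unicode"))
```

    In this sketch the user never sees which source contributed which attribute; only the joined global element is exposed, which is the sense of "transparent access" used here.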
  • Next, I will present our experience validating the construction of the mini-grid.
  • We executed two tests to validate the correct construction of the e-GrOV mini-grid: one on data transfer between the nodes and one on the definition of a database within a node of the mini-grid. The data transfer test was executed by transferring four data files of different sizes, from 20 MB to 3 GB. Each data file contains a copy of the schema and tuples of a relation from the INAOE sloanm database. Files were transferred over a secure FTP connection between the INAOE cluster and the e-grov1 and e-grov2 servers. The table presents the results of this first test. The transfer time in our system is reasonable even when the server retrieves a file of approximately 3 GB. Estimating the cost of transferring the same file over the Internet, we conclude that using Internet 2 reduces the transfer time by approximately one hour. Thanks to this improvement, e-GrOV can ensure that the results of experiments are returned to astronomers in a reasonable time.
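  • The kind of estimate mentioned above can be sketched as simple arithmetic: an idealized transfer time is the number of bits to send divided by the link rate. The function name and the two link rates below are hypothetical placeholders, not the measured values from the table, and the model ignores protocol overhead and congestion.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Idealized transfer time in seconds: bits to send / link rate.

    Assumes decimal units (1 Mbps = 1,000,000 bits per second) and
    ignores protocol overhead, latency and congestion.
    """
    return size_bytes * 8 / (link_mbps * 1_000_000)


if __name__ == "__main__":
    size = 3 * 10**9  # roughly 3 GB, the largest file size mentioned in the test
    # Hypothetical link rates, for illustration only.
    for label, mbps in [("slower link", 10.0), ("faster link", 100.0)]:
        minutes = transfer_seconds(size, mbps) / 60
        print(f"{label}: {minutes:.1f} minutes")
```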
  • The database population test was executed by migrating a partial copy of astronomical data from the INAOE cluster. This copy is stored in the skysurvey database on the e-grov1 and e-grov2 servers and is composed of three main relations. The INAOE database was migrated by creating a backup file in the INAOE cluster. The backup file was transferred to the UDLA servers over a secure FTP connection between the INAOE cluster and the UDLA node. The database was populated using the functionalities provided by the PostgreSQL engine. The table presents the results of the tuple insertion into the skysurvey database. The execution time in our system is reasonable when the server inserts approximately 500,000 tuples of 66 bytes. The cost of tuple insertion using a Java application can be affected by the PostgreSQL specifications and by the characteristics of the Internet 2 connection used to access the server.
  • Finally, I will present the conclusions and the future work of the e-GrOV project.
  • In this presentation we presented e-GrOV, a platform for building scientific environments in which to execute scientific experiments over astronomical data and to exploit analysis tools and knowledge databases. This platform aims to organize, manage, give access to and disseminate mega samples of astronomical data, as well as their related tools and knowledge. However, a cooperation agreement among participants is necessary to access private resources. The main result of the e-GrOV project is the implementation of a mini-grid, an infrastructure that provides transparent access to integrated and distributed astronomical data and gives the illusion of a single virtual system with high computing capabilities for developing scientific experiments. The mini-grid uses the bandwidth and capacity of Internet 2 to transfer large amounts of data in a secure and efficient way.
  • Future work on e-GrOV involves validating the mini-grid by defining mega-databases in the INAOE and UATx nodes using Andromeda. Defining these mega-databases would make it possible to build environments for the definition and execution of scientific experiments, as well as a virtual knowledge library. Finally, thanks to the grid infrastructure, the e-GrOV project could be extended by connecting to other astronomical databases in Mexico, such as those of the UNAM and the Baja California Observatory. Subsequently, e-GrOV will provide the infrastructure for the construction of a Mexican Virtual Observatory (MVO).
  • Thanks 
  • Building the eGrov Virtual Observatory

    1. Building the eGrOV Virtual Observatory*. Gabriela Montiel Moreno, José-Luis Zechinelli-Martini: Research Center of Information and Automation Technologies, CENTIA, UDLAP; French-Mexican Laboratory of Computer Science and Automatic Control (LAFMIA UMI 3175). Genoveva Vargas-Solar: French National Council of Scientific Research, LIG, Grenoble; French-Mexican Laboratory of Computer Science and Automatic Control (LAFMIA UMI 3175). * Project financed by CUDI
    2. Roadmap: Context and motivation; The e-GrOV project; Building the e-GrOV virtual observatory (Mini-grid construction, Data integration in the e-GrOV, Validation experiments); Conclusions and future work
    3. Context and motivation
    4. Context and motivation: Which sky objects are related to the Andromeda galaxy?
    5. Context and motivation: Which sky objects are related to the Andromeda galaxy?
    6. Context and motivation
    7. Context and motivation
    8. Context and motivation: Virtual observatory
    9. Towards a virtual observatory (diagram labels: Storage & Data, e-GrOV Middleware, Experiment environment management, Internet 2, Communication infrastructure)
    10. Towards a virtual observatory (diagram labels: Middleware, Grid infrastructure)
    11. The e-GrOV project (diagram label: Internet2)
    12. The e-GrOV project (diagram labels: e-GrOV, Internet2)
    13. The e-GrOV project (diagram labels: e-GrOV, Mexican Virtual observatory, Internet2)
    14. e-GrOV results. Mini-grid and mega-databases: infrastructure that offers transparent access and management of data, documents, libraries and computing capabilities located in the e-GrOV nodes. Scientific workflows: environments for the definition, execution and storage of scientific experiments considering the services subscribed into the e-GrOV. Virtual knowledge libraries: documents describing resources of the e-GrOV (user manuals, user experiences and comments about the execution of experiments).
    15. Roadmap: Context and motivation; The e-GrOV project; Building the e-GrOV virtual observatory (Mini-grid construction, Data integration in the e-GrOV, Validation experiments); Conclusions and future work
    16. e-GrOV mini-grid construction (diagram: UDLA, INAOE and UATx nodes linked through Internet and Internet 2 to the astronomical databases 2Mass, Sky Server and Sky Server Spectra; database cluster with RAID)
    17. Nodes characteristics (table columns: Resource, RAM memory, Storage, Network, Software)
        INAOE: IBM x335 server (1 GB RAM, 2 DD of 73 GB); IBM DS300 RAID (7 DD of 73 GB). Network: a wireless connection of 20 Mbps to UDLA Internet 2; 2 connections to Internet of 34 Mbps. Software: private databases sloanm, sloan, sloane; PostgreSQL.
        UDLA: SUNFire v880 (2) (4 GB RAM, 7 DD of 57 GB). Network: a connection to the Internet 2 backbone. Software: Skysurvey private database; PostgreSQL engine; MySQL engine; Andromeda mediation system.
        MacBook Pro (2 GB RAM, 250 GB).
        UATx: SUNFire v440 (2) (8 GB RAM); SUNFire v210 (2) (4 GB RAM). Network: E-carrier links E1 and E3. Software: BPELOnAsmL system.
    18. Roadmap: Context and motivation; The e-GrOV project; Building the e-GrOV virtual observatory (Mini-grid construction, Data integration in the e-GrOV, Validation experiments); Conclusions and future work
    19. Data integration using Andromeda [1] (diagram: Andromeda mediators at the UDLA, INAOE and UATx nodes, linked through Internet and Internet 2 to the astronomical databases Sky Survey, 2Mass, Sky Server, Sky Server Spectra and Sloanm)
    20-25. Data integration using Andromeda [1] (diagram builds: SkySurvey, 2Mass, SkyServer, SSSpectra and INAOE wrappers placed between the Andromeda mediators and the astronomical databases Sky Survey, 2Mass, Sky Server, Sky Server Spectra and Sloanm)
        [1] Víctor Cuevas-Vicenttín, José-Luis Zechinelli-Martini, Genoveva Vargas-Solar: Andromeda: Building e-Science Data Integration Tools. DEXA 2006: 44-53
    26. Roadmap: Context and motivation; The e-GrOV project; Building the e-GrOV virtual observatory (Mini-grid construction, Data integration in the e-GrOV, Validation experiments); Conclusions and future work
    27. Data transfer validation: Transfer of files (different sizes) between the UDLA and INAOE nodes. Files contain a copy of the schema and tuples from relations in the INAOE sloanm database. Use of a secure FTP connection between the INAOE cluster and the e-grov1 and e-grov2 servers (UDLA).
    28. Database population validation: Migration of a partial copy of astronomical data from the INAOE cluster. Definition of the skysurvey database in the UDLA node, composed of three main relations. Use of a secure FTP connection between the INAOE cluster and the e-grov1 and e-grov2 servers (UDLA). Use of PostgreSQL capabilities for the population of the database.
    29. Roadmap: Context and motivation; The e-GrOV project; Building the e-GrOV virtual observatory (Mini-grid construction, Data integration in the e-GrOV, Validation experiments); Conclusions and future work
    30. Conclusions. e-GrOV, a platform for the construction of scientific environments: organize, manage, give access to and disseminate mega samples of astronomical data and their related tools and knowledge; requires a cooperation agreement among participants for accessing private resources. Astronomical mini-grid: Internet 2 for providing high performance and bandwidth capabilities; transparent access to distributed astronomical data; high computing capabilities for executing scientific experiments.
    31. Future work. Validation of the e-GrOV mini-grid: definition of mega-databases in the INAOE and UATx nodes using Andromeda; definition and execution of scientific experiments; definition of a virtual knowledge library. Construction of a Mexican Virtual Observatory (MVO): extending e-GrOV with astronomical databases in the UNAM and the Baja California Observatory.
    32. Thank you!
