Publication and long-term archival of observational data in the environmental sciences is a challenging topic in today's eScience research. The amount of effort that goes into technical and scientific quality assurance prior to publication is considerable and may well turn out to be a barrier to data publication. Our project's goal is to reduce the manual effort and, at the same time, increase data quality in the process of submitting observational data – in this case meteorological observations – for publication. This goal is divided into the following subgoals:
Establish a standard procedure for the publication of observational data in meteorology, including quality information.
Develop a workflow system to automate the publication process (a sketch of such a workflow follows this list).
Make the procedure usable for the environmental sciences in general.
Integrate the procedure into an existing central data repository for meteorology (the CERA database at the World Data Center for Climate).
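To make the intended automation more concrete, the following is a minimal, purely illustrative Python sketch of the stages such a publication workflow could enforce. The stage names and transitions are assumptions derived from the subgoals above, not the project's actual implementation.

```python
from enum import Enum, auto

class Stage(Enum):
    SUBMITTED = auto()       # data and metadata handed in by the scientist
    TECHNICAL_QA = auto()    # automated format and completeness checks
    SCIENTIFIC_QA = auto()   # review of plausibility and quality flags
    ARCHIVED = auto()        # stored long-term in the central repository
    PUBLISHED = auto()       # persistent identifier assigned, record citable

# Allowed transitions of a submission through the (hypothetical) workflow.
TRANSITIONS = {
    Stage.SUBMITTED: {Stage.TECHNICAL_QA},
    Stage.TECHNICAL_QA: {Stage.SCIENTIFIC_QA, Stage.SUBMITTED},  # back to submitter on errors
    Stage.SCIENTIFIC_QA: {Stage.ARCHIVED, Stage.SUBMITTED},
    Stage.ARCHIVED: {Stage.PUBLISHED},
    Stage.PUBLISHED: set(),
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move a submission to the next stage, rejecting transitions the workflow does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

A workflow engine enforcing such transitions is what keeps manual effort low while making the quality-assurance steps mandatory.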
This talk is about the current state of the project from an eResearch and technical point of view.
The document discusses DataONE, a project aimed at improving data repository interoperability and advancing best practices in data lifecycle management. It focuses on enabling access to multiple external data repositories from within a HUB environment. This would allow users to aggregate and integrate disparate datasets for new analyses, and enable reproducible workflows. The goal is to address issues around scattered and dispersed data by improving discovery, integration and long-term preservation of datasets.
ESRI User Conference 2014 - A Location Aware Mobile Tool for Direct and Indir... (Francisco Ramos)
Access to GIS data from mobile platforms continues to be a challenge and there is a wide range of fields where it is extremely useful. In this work, we combined three key aspects: climate data sensors, mobile platforms and spatial proximity operations. We published and made use of a web 2.0 network of climate data, where content is user-collected, by means of their meteorological stations, and exposed as available information for the virtual community. Moreover, we enriched this data by giving the users the opportunity to directly inform the system with different climate measures. In general, management of this type of information from a mobile application could result in an important decision tool, as it enables us to provide climate-related data according to a context and a geographical location. Therefore, we implemented a native mobile application for iPhone and iPad platforms by using ArcGIS SDK for iOS and by integrating a series of ArcGIS webmaps, which allows us to perform geospatial queries based on the user’s location, offering, at the same time, access to all the data provided by the climate data sensor network and from direct users.
Interpreting Climate Data - Analysing climate vulnerability - online training ... (Vestlandsforsking WRNI)
Interpreting Climate Data
This module provides an introduction to climate data and how to effectively use it. The following will be covered:
How regionalised climate data is produced
How to understand and interpret regionalised climate data
How to identify and communicate uncertainties
Climate Data Sharing for Urban Resilience - OGC Testbed 11 (George Percivall)
OGC Testbed 11:
Delivering on our commitment to the Climate Data Initiative
In December 2014 the US White House Office of Science and Technology Policy (OSTP) released a Policy Fact Sheet titled "Harnessing Climate Data to Boost Ecosystem & Water Resilience." The Fact Sheet includes OGC’s commitment to increase open access to climate change information using open standards. Testbed 11 results are now available, delivering on that commitment.
The results of this major interoperability testbed contribute to development and refinement of international standards that are critical for the communication and integration of geospatial information. http://www.opengeospatial.org/projects/initiatives/testbed11
• Nine sponsors provided requirements and funding for Testbed 11.
• Thirty organizations participated in Testbed 11 by contributing prototypes and engineering reports and by taking part in a scenario-driven demonstration of the technical advances.
Technical results of Testbed 11 relevant to the Climate Data Initiative include:
• Analysis and prediction based on open climate data accessed using open standards
• Making predictive models more accessible with OGC Web Processing Service (WPS) – see the request sketch after these lists
• Verifying model predictions using mobile operations, in-situ gauges and social media.
Climate adaptation, resilience and security planning based on technology from OGC Testbed 11:
• Estimating the geographic extent of coastal inundation in dynamic weather conditions
• Assessing social unrest associated with populations displaced by climate change
• Integrating spatial and non-spatial models of human geography and resilience
• Predictive models and verifications to support planning and response phases
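The WPS bullet above refers to the OGC Web Processing Service interface. As a rough illustration of how a client reaches such a service, the sketch below issues the two standard WPS 1.0.0 discovery requests over plain HTTP; the endpoint URL and the process identifier are placeholders, not actual Testbed 11 services.

```python
import requests

# Hypothetical WPS endpoint; any OGC WPS 1.0.0 server answers these KVP requests.
WPS_URL = "https://example.org/wps"

# 1. Ask the server which processes it offers.
caps = requests.get(WPS_URL, params={
    "service": "WPS",
    "request": "GetCapabilities",
    "version": "1.0.0",
})

# 2. Ask for the inputs and outputs of one advertised process (identifier is made up).
desc = requests.get(WPS_URL, params={
    "service": "WPS",
    "request": "DescribeProcess",
    "version": "1.0.0",
    "identifier": "coastal_inundation",
})

print(caps.status_code, desc.status_code)  # both responses are XML documents
```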
The document discusses using the raster package in R to work with geographical grid data. It covers downloading and loading the raster package, creating raster objects and adding random values, reading in real climate data files, performing operations like cropping and aggregation, and sources for global climate data like WorldClim.
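The summary above describes an R raster workflow. A loosely comparable sketch in Python, assuming the rasterio package and a hypothetical single-band GeoTIFF, shows the cropping and aggregation steps it mentions.

```python
import rasterio
from rasterio.windows import from_bounds
from rasterio.enums import Resampling

# "climate_grid.tif" is a hypothetical GeoTIFF of gridded climate data.
with rasterio.open("climate_grid.tif") as src:
    # Crop: read only the cells inside a bounding box given in the file's CRS.
    window = from_bounds(5.0, 47.0, 15.0, 55.0, transform=src.transform)
    subset = src.read(1, window=window)

    # Aggregate: resample the full grid to a tenth of its resolution by
    # averaging blocks of cells (roughly what raster::aggregate does in R).
    coarse = src.read(
        1,
        out_shape=(src.height // 10, src.width // 10),
        resampling=Resampling.average,
    )

print(subset.shape, coarse.shape)
```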
This document provides guidance on effectively communicating technical information to different audiences. It discusses documenting processes, systems and user manuals to create consistency, enable training and ensure compliance. It emphasizes standardizing templates and best practices. The document also explains how to customize communications by considering the audience, their needs and level of technical knowledge. It categorizes typical audience groups as co-workers, management, executives and users, and provides examples of what each would need to know to make decisions. The key message is to think about the audience and summarize information simply rather than using jargon or talking above their level in order to gain acceptance and approval for initiatives.
Edwards climate data detectives - yale 2-2015 (Paul Edwards)
Talk to the Yale History of Science/History of Medicine group on 9 Feb 2015. Discusses evolving knowledge infrastructures, the history of weather and climate data, and several recent controversies over climate data. "Data detectives" refers to the forensic work involved in adjusting data for changes in instrument characteristics, location, etc.
Dolphin Data Lab is setting up data recovery service centers in countries around the world, including Saudi Arabia. The document provides background information on Saudi Arabia, such as its country code, capital, languages, and currency. It states that Dolphin Data Lab is becoming known in Saudi Arabia for its data recovery, security hardware, and software solutions. The document also lists several data recovery tools that Dolphin Data Lab provides.
ICLR Friday Forum: Climate data in Ontario (November 13, 2015) (glennmcgillivray)
Ryan Ness presented on climate data and information needs in Ontario. He discussed the importance of climate information for adaptation planning and highlighted some of the challenges users face, including uncertainty and scale mismatches. While climate data is available, users need help applying it appropriately. Current ad-hoc approaches can lead to inconsistent results and ineffective adaptation. Looking ahead, the focus should be on flexible, resilient solutions rather than overdesigned infrastructure. The Ontario Climate Consortium aims to provide region-specific climate expertise to help decision-makers meet adaptation needs.
Contextualizing the Visualization of Climate Data (Raquel Alegre)
EGU 2014, 27th April - 2nd May 2014, Vienna (Austria)
Session: Techniques and tools for effective visualization and sonification in the geosciences
Category: Earth & Space Science Informatics (ESSI)
Saudi Arabia is almost entirely reliant on domestic oil for its energy needs and economic development. While it has pledged to increase renewable energy to 33% by 2032, it currently obtains most of its government revenue from oil exports. At UN climate conferences, Saudi Arabia has historically taken an obstructionist position and demanded compensation from developed countries for any potential economic losses from climate change mitigation efforts. It views itself as a developing country without binding emissions targets and wants to maximize profits from its large oil reserves before transitioning to renewable energy.
Saudi Arabia faces significant water scarcity issues due to limited natural water sources like rivers and lakes as well as low rainfall. Most of its water comes from desalination plants and dams. If water usage continues to increase at the current rate, 70% of the population may lack access to safe drinking water by 2030. Australia has more abundant water sources but also experiences water issues in some inland, arid areas with low rainfall like Western Australia. Both countries emphasize sustainable water usage, collection, and conservation efforts to ensure long-term water security for their growing populations.
Collaborate 2012: Environmental Accounting and Reporting (Angela Miller)
Presentation on the implementation of Oracle's Environmental Accounting and Reporting (EAR) module in JD Edwards. As more entities desire to self-report to The Climate Registry and the Global Reporting Initiative, tools like EAR will become imperative for their organizations to automate the data collection.
Saudi Arabia is a hot and dry desert country ruled by a monarchy. It has a population of over 28 million people, almost all of whom are Muslim. Women have few rights and must have a male guardian. The country contains the holiest sites in Islam and relies heavily on oil exports, with petroleum making up 90% of its exports. Saudi Arabia maintains strong ties with the United States due to its role in the Arab world and large oil reserves.
These visuals were prepared to support a string quartet performance and panel on climate change at Northwestern University in February 2016.
A well-designed graphic can help audiences to quickly understand the main message embedded within a complex set of climate data and to retain those ideas longer than they would have if they were conveyed by words alone. But the visual aids used regularly by climate scientists also have their limitations: they are most easily understood by people who are already fluent in technical illustrations; they're usually static and sometimes do not tell an obvious story; and for many, they don't elicit a strong emotional response.
Music, by contrast, is inherently narrative and is known to exert a powerful influence on human emotions. Because of this, sonification — the transformation of data into acoustic signals — may have considerable promise as a tool to enhance the communication of climate science.
Daniel Crawford and Scott St. George report on a collaboration between scientists and artists that uses music to transmit the evidence of climate change in an engaging and visceral way.
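As a toy illustration of the sonification idea described above (not the method used by Crawford and St. George), the following Python sketch maps a short series of made-up temperature anomalies onto pitches and writes the result as a WAV file; it assumes numpy and scipy are available.

```python
import numpy as np
from scipy.io import wavfile

# Hypothetical annual temperature anomalies (degrees C) to be sonified.
anomalies = np.array([-0.3, -0.1, 0.0, 0.2, 0.4, 0.7, 0.9])

rate = 44100                 # samples per second
tone_len = int(0.4 * rate)   # 0.4 seconds of sound per data point
t = np.arange(tone_len) / rate

# Map each value linearly onto a pitch range (220 Hz to 880 Hz).
lo, hi = anomalies.min(), anomalies.max()
freqs = 220 + (anomalies - lo) / (hi - lo) * (880 - 220)

# One sine tone per data point, concatenated into a single signal.
signal = np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
wavfile.write("anomalies.wav", rate, (signal * 32767).astype(np.int16))
```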
Big Data and the Climate/Environment domain (vis-a-vis the respective H2020 Societal Challenge) - Opportunities, Challenges and Requirements. As presented and discussed in the public launch of the BigDataEurope project.
Saudi Aramco has developed a comprehensive technology roadmap for carbon management that focuses on CO2 capture, storage, and utilization. As part of this roadmap, Saudi Aramco plans to implement a CO2-EOR demonstration project in Saudi Arabia by 2013/2014. This project will inject CO2 into an oil field using 4 injectors and 4 producers to test CO2-EOR and monitor CO2 storage. It aims to evaluate CO2 potential for EOR and storage while testing new monitoring and surveillance technologies to track the CO2 plume and changes in oil saturation.
The Role of Data for Climate Monitoring and Prediction (NAP Events)
Presentation by: Stefan Rösner
4.1 Climate services in support of NAPs
This event will bring together experts involved in the provision of climate services and testimony from countries of how climate services are being used to support decision-making and effective adaptation. The event will start with brief statements, and will be followed by a panel discussion, where participants from the floor will have the opportunity to engage the panelists with questions or comments. The panel will demonstrate the practical benefits of climate services in support of climate risk management and adaptation to climate variability and change. It will also provide lessons learned through various activities being implemented at regional and national level.
Presentation on groundwater management in Saudi Arabia by Dr. Ali Saad Al-Tokhais at the International Annual UN-Water Zaragoza Conference 2012/2013. Preparing for the 2013 International Year. Water Cooperation: Making it Happen! 8-10 January 2013.
The Gulf Cooperation Council (GCC) was established in 1981 between Saudi Arabia, Kuwait, Bahrain, Qatar, Oman, and the United Arab Emirates to promote coordination and unity between the member states. In later years, the GCC eliminated tariffs between members, established a common market and aims to create a single currency union. The member states are primarily hot desert climates dominated by oil and gas economies.
Best practices and tips on how to design and develop a Data Warehouse using Microsoft SQL Server BI products.
This presentation describes the inception and full lifecycle of the Carl Zeiss Vision corporate enterprise data warehouse.
Technologies covered include:
• Using SQL Server 2008 as your data warehouse DB
• SSIS as your ETL tool
• SSAS as your data cube tool
You will learn:
• How to architect a data warehouse system from end to end
• Components of the data warehouse and their functionality
• How to profile data and understand your source systems
• Whether to ODS or not to ODS (determining if an operational data store is required)
• The staging area of the data warehouse
• How to build the data warehouse – designing dimension and fact tables
• The importance of using conformed dimensions
• ETL – moving data through your data warehouse system
• Data cubes – OLAP
• Lessons learned from Zeiss and other projects
Climate data in India - Open and Closed (Pavan Srinath)
A presentation made at Open Data Camp, Bangalore, on March 24, 2012.
A part of the Know Your Climate initiative (knowyourclimate.org) of Public Affairs Centre.
This document provides information about a geography lesson on climate for classes 9 and 11 presented by Ashutosh Karasharma Mishra. It includes the presenter's contact information and objectives of the lesson which are to familiarize students with climate concepts and mechanisms, study India's climate through local analysis, and understand spatial and seasonal climate variations. The content covers climatic diversity in India, factors affecting climate, rhythm of seasons, rainfall distribution, and climate change. Interactive activities for students are suggested.
The Open Science Data Cloud is a hosted, managed, distributed facility that allows scientists to manage and archive medium and large datasets, provide computational resources to analyze the data, and share the data with colleagues and the public. It currently consists of 6 racks, 212 nodes, 1568 cores and 0.9 PB of storage across 4 locations with 10G networks. Projects using the Open Science Data Cloud include Bionimbus for hosting genomics data and Matsu 2 for providing flood data to disaster response teams. The goal is to build it out over the next 10 years into a small data center for science that can preserve data like libraries and museums preserve collections.
DataCite is a global consortium that provides persistent identifiers (DOIs) for scientific data to make it easily discoverable and citable. It aims to put datasets on the same level as research articles. DataCite has over 1.7 million DOIs registered and many member organizations worldwide. It develops standards and infrastructure like its metadata schema and search portal to help data archives and researchers globally.
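As an illustration of how a registered dataset DOI can be resolved to its metadata, the sketch below queries the public DataCite REST API from Python; the DOI value is a placeholder rather than a real registration.

```python
import requests

# Illustrative DOI only; replace with any DOI registered through DataCite.
doi = "10.1594/WDCC/EXAMPLE"

resp = requests.get(f"https://api.datacite.org/dois/{doi}")
if resp.ok:
    attrs = resp.json()["data"]["attributes"]
    # Title, publisher and year are part of the mandatory DataCite metadata.
    print(attrs["titles"], attrs["publisher"], attrs["publicationYear"])
else:
    print("DOI not found:", resp.status_code)
```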
Cyberenvironments integrate shared and custom cyberinfrastructure resources into a process-oriented framework to support scientific communities and allow researchers to focus on their work rather than managing infrastructure. They enable more complex multi-disciplinary challenges to be tackled through enhanced knowledge production and application. Key challenges include coordinating distributed resources and users without centralization and evolving systems rapidly to keep pace with advancing science.
DSD-INT 2015 - Addressing high resolution modelling over different computing ... (Deltares)
This document discusses using high performance computing resources to model water quality and hydrodynamics in reservoirs. It summarizes work using the Delft3D model on the Cuerda del Pozo reservoir, which experiences eutrophication issues. Testing was conducted on supercomputers and cloud infrastructures to determine appropriate resources for high resolution models. The work provides a useful demonstration case for infrastructure projects seeking water quality modeling applications. Addressing eutrophication through modeling can help alert authorities to water quality issues before they occur.
DSD-INT 2015 - Addressing high resolution modelling over different computing ... (Deltares)
This document discusses using high performance computing resources to model water quality and hydrodynamics in reservoirs. It summarizes work using the Delft3D model on the Cuerda del Pozo reservoir, which experiences eutrophication issues. The modeling aims to reproduce conditions like algae blooms and alert authorities before water quality deteriorates. While hydrodynamics modeling was successful, water quality modeling had issues to be addressed. The document also discusses using cloud resources through the EGI FedCloud to run the high resolution models, as well as potential applications of this use case for biodiversity infrastructure projects like LifeWatch and INDIGO-DataCloud.
Tim Malthus_Towards standards for the exchange of field spectral datasets (TERN Australia)
This document discusses the development of standards for the exchange of field spectral datasets. It notes the importance of metadata for determining the quality and representativeness of spectral data obtained in the field. A workshop was held in 2012 to discuss best practices for data collection and exchange and key conclusions included the need for standards to facilitate accurate comparison across studies and the role of thorough metadata. Work is ongoing to enhance the SPECCHIO system for hosting spectral libraries and metadata and establishing it as the international tool for storage and exchange of spectral datasets.
Tim Osborn: Research Integrity: Integrity of the published record (Jisc)
The document discusses several key points regarding climate change research data and integrity:
1) Climate change research faces intense scrutiny given its high stakes, making data integrity especially important.
2) While improved data management may not have changed criticisms of a controversial email leak, the climate science community should still improve data sharing and linking publications to underlying data and analyses.
3) Doing so supports reproducibility, research integrity, further exploration of findings, and data reuse, helping inform important decisions around climate change. However, challenges like data volumes, disclosure agreements, resources, and unclear benefits must also be considered.
The document describes the development of a Hydrologic Community Modeling System (HCMS) using a workflow engine called TRIDENT. The HCMS will allow for modular and integrated hydrologic models with interchangeable components. It will include libraries for data access, processing, hydrologic models, and post-analysis tools. Example applications to the Schuylkill Watershed are provided to demonstrate watershed delineation, hydrologic response unit creation, meteorological data processing, and potential evapotranspiration calculation workflows.
SCENZ-Grid proposes a science collaboration and computation environment to enable researchers to do science research online together by sharing data and computational resources without duplicating them. It would allow researchers to collaboratively develop and use shared models and workflows, and connect researchers directly to policymakers, managers, educators and the public. The environment would use spatial, computational and web services and technologies like Web Processing Services, 52North, GeoServer and OpenLayers to query and analyze data from multiple sources on distributed computational resources. It aims to establish relationships between different datasets from various organizations through similarity matrices and expertise to combine attributes from different classifications.
Knowledge Discovery in Environmental Management (Dr. Aparna Varde)
This is a research presentation by Aparna Varde during a summer research visit at the Max Planck Institute for Informatics, Saarbruecken, Germany in August 2015 within the research group of Dr. Gerhard Weikum. The presentation focuses on various aspects of data mining and knowledge discovery pertinent to environmental science and management. It encompasses three main topics: (1) decision support for the greening of data centers; (2) predictive analysis in urban planning and simulation; and (3) common sense knowledge for domain-specific KBs. It includes a few brief highlights on web and text mining in article / collocation error detection as well as in terminology evolution. This presentation is based on her relevant work as per August 2015, serving as an invited talk during this research visit.
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen... (Cybera Inc.)
Rob Fatland gave this presentation to the GeoCENS SSC Workshop on the current efforts, projects, and tools towards advancing environmental science in Banff, AB, September 23, 2010.
Semantically-Enabled Environmental Data Discovery and Integration: Demonstrat... (Tatiana Tarasova)
This presentation is about a framework for semantically-enabled data discovery and integration across multiple Earth science disciplines. Data harmonization was based on the principles of Linked Data. Previous works define the Data Cube extensions which are relevant to certain Earth science disciplines. To provide a generic and domain independent solution, we propose an upper level vocabulary (the ENVRI vocabulary) that allows us to express domain specific information at a higher level of abstraction.
From a human viewpoint we provide an interactive Web based user interface for data discovery and integration across multiple research infrastructures (http://portal.envri.eu). The system is demonstrated on a use case of the Iceland Volcano’s eruption on April 10, 2010.
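To indicate what querying such harmonised, Linked Data observations can look like in practice, here is a small sketch using Python's rdflib; the input file and the envri:value property are assumptions for illustration, not part of the published ENVRI vocabulary.

```python
from rdflib import Graph

g = Graph()
# Hypothetical Turtle file of harmonised observations published as an RDF Data Cube.
g.parse("observations.ttl", format="turtle")

# List each observation with its dataset and observed value.
# qb: is the W3C RDF Data Cube vocabulary; envri:value is an assumed property.
query = """
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX envri: <http://example.org/envri#>
SELECT ?obs ?dataset ?value WHERE {
  ?obs a qb:Observation ;
       qb:dataSet ?dataset ;
       envri:value ?value .
}
"""
for obs, dataset, value in g.query(query):
    print(obs, dataset, value)
```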
The document discusses empowering transformational science through open data access, optimized data formats, and open-source tools. It argues that traditional methods of accessing large datasets can be inefficient, with 80% of time spent on data preparation and only 10% on analysis. New approaches using analytics-optimized data stores (AODS) like Zarr, together with tools like Xarray and Dask, allow accessing large datasets with a single line of code and performing analyses within minutes by leveraging lazy loading and parallel computing. This represents a paradigm shift from traditional project timelines that can reduce barriers to science and increase reproducibility, empowering more researchers to efficiently analyze data and focus on scientific questions.
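The "single line of code" pattern described above can be illustrated with a short, hypothetical Xarray/Zarr sketch; the store URL and variable name are placeholders, and remote access assumes fsspec is installed.

```python
import xarray as xr

# Open a cloud-hosted Zarr store lazily: only metadata is read at this point.
ds = xr.open_zarr("https://example.org/climate-store.zarr", consolidated=True)

# Build the computation graph for a global-mean time series (still lazy, Dask-backed).
global_mean = ds["tas"].mean(dim=("lat", "lon"))

# Only now are the required chunks fetched and processed in parallel.
result = global_mean.compute()
print(result)
```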
1) Data life cycles describe the stages data passes through from creation to obsolescence, including creating, processing, analyzing, preserving, accessing, and reusing data.
2) The document proposes modeling data life cycles and their relations to EUDAT services using the W3C PROV standard to track provenance (a minimal PROV sketch follows this list).
3) A proof-of-concept service is being built to allow graphical representation of data life cycles, create life cycle plans and templates, and capture provenance during execution by filling templates.
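A minimal sketch of what such PROV-based provenance capture could look like, using the third-party Python prov package (an assumption for illustration; the EUDAT service may use different tooling):

```python
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/lifecycle/")  # assumed namespace

raw = doc.entity("ex:raw-observations")
clean = doc.entity("ex:quality-controlled-dataset")
qa = doc.activity("ex:technical-quality-assurance")

# Record that the QA activity used the raw data and generated the cleaned dataset.
doc.used(qa, raw)
doc.wasGeneratedBy(clean, qa)
doc.wasDerivedFrom(clean, raw)

print(doc.get_provn())  # human-readable PROV-N serialisation of the life-cycle fragment
```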
This document discusses how open ecosystems can help science by storming the cloud. It argues that making data, software, and computing resources open through cloud-based platforms removes barriers and allows for more collaborative and interdisciplinary research at larger scales. This approach broadens participation in science and can accelerate progress by building on others' work.
The document provides an overview of the NGSP Group's research at Swinburne University of Technology, including three key areas:
1) Data management in cloud computing focusing on storage, placement, and replication strategies.
2) Performance management of scientific workflows, addressing temporal quality of service, constraints, monitoring, and violation handling.
3) Security and privacy protection in the cloud, particularly developing noise obfuscation techniques to protect indirect private information during normal cloud service processes.
The document discusses data publication at the National Snow and Ice Data Center (NSIDC). It provides statistics on the data ingested, archived, and distributed by NSIDC each year. It also discusses the importance of metadata for discovering, assessing, obtaining, understanding, and using data. Different levels of metadata and services are needed to support varying levels of data discovery, access, documentation, distribution and user support.
Your Research Data Management with the support of 3TU.Datacentrum (AnnemiekvdKuil)
The 3TU.Datacentrum offers researchers support with their data management and data storage. This presentation is given to researchers of Delft University of Technology.
Research data spring: a consortial approach to RDM within SaS (Jisc RDM)
The research data spring project "A consortia-based approach to Research Data Management systems within small and specialist institutions" slides for the third sandpit workshop. Project led by CREST, Leeds Trinity University, Arkivum, and ULCC.
Aspects of Reproducibility in Earth Science (Raul Palma)
The document discusses aspects of reproducibility in earth science research within the European Virtual Environment for Research - Earth Science Themes (EVEREST) project. The key objectives of EVEREST are to establish an e-infrastructure to facilitate collaborative earth science research through shared data, models, and workflows. Research Objects (ROs) will be used to capture and share workflows, processes, and results to help ensure reproducibility and preservation of earth science research. An example RO is described for mapping volcano deformation using satellite imagery and other data sources. Issues around reproducibility related to data access, software dependencies, and manual intervention in workflows are also discussed.
Similar to Andreas Hense: Climate data for our future – acquired, analysed, archived (20)
The talk presents Thieme's activities in the context of Open Access and Open Data. The presentation also looks at the authors' and the users' perspectives and Thieme's practical experiences with both.
Science publications used to have the joint function of keeping the 'Minutes of Science' as well as transferring knowledge. The sheer amounts of material published (2 new articles in PubMed every minute of every day) make comprehensive knowledge transfer via reading of articles virtually impossible. When the literature is open, though, much of the essential knowledge it contains can be distilled and the big picture obtained without having to read all the articles, so that reading can then be reserved for those key articles that give insight in the reasoning and argumentation that leads to consensus. The result is a much more efficient knowledge transfer that doesn't have to compromise on comprehensiveness.
Open Access is not only a model for organizing the production, distribution, and usage of knowledge but is also the expression of a new value system which is being developed in the electronic environment. This system is mainly based on the concept of knowledge as a commons. The commons is the central concept of knowledge ecology. This is still an unusual concept. Ecology in general is concerned with the sustainability of natural resources (for instance water, air/climate, forests) by protecting these resources from overuse, and knowledge ecology aims at the same objective of sustainability. But rather than making the immaterial resources of knowledge and information a scarce good (as is necessary with natural resources), sustainability of immaterial goods can only be achieved by the opposite: open and free access and unrestricted use.
The central objective of knowledge ecology is to achieve the goal of people-centered, inclusive and sustainable knowledge societies. In the commons paradigm, a new consensus needs to be achieved concerning traditional concepts such as freedom of information and science, intellectual property, authorship and the nature of knowledge objects in general. Today, with the evident crisis of the market paradigm, not only in the finance markets, but also with respect to the disabling effects of commercialized information markets, with damaging effects not only for education and science and for the private consumer markets, but also for the innovation potential of the entire economy, there is a chance for a renaissance of the old idea of the commons, a renaissance of the primacy of common property rights as opposed to private property rights.
The concept of information ecology and in its context the idea of open access provides an alternative both to existing commercial publishing models on the international information markets and to international copyright regulations, which, in the last 20 years, have mainly emphasized the economic impact of knowledge and information. Neither the markets nor regulation by law have taken sufficiently into account the genuine character of knowledge as a common-pool resource. Information ecology does not object to the commercial use of knowledge produced in public environments such as universities and research centers, but suggests that publishing models are only acceptable when they acknowledge the status of knowledge as a commons, allowing free and open access for everyone. This commons must be based on sharing knowledge, producing new knowledge collaboratively, and providing future generations with the same access and usage rights.
Mr. Haank addresses Springer's position on Open Access. What has changed over the last years, what has stayed the same? Is hybrid developing into fully open, or will the models co-exist? He also touches upon the issue of (open) data. Making data available in a structured, useful way is much more complex than the current practice of article publishing.
Today libraries face new and greater challenges in enabling access to information. The growing amount of information, in combination with new non-textual media types, demands constant adaptation of established workflows and standard definitions. Knowledge, as published through scientific literature, is the last step in a process originating from primary scientific data. These data are analysed, synthesised and interpreted, and the outcome of this process is published as a scientific article. Access to the original data as the foundation of knowledge has become an important issue throughout the world, and different projects have started to find solutions.
Nevertheless science itself is international; scientists are involved in global unions and projects, they share their scientific information with colleagues all over the world, they use national as well as foreign information providers.
When facing the challenge of increasing access to research data, a possible approach should be global cooperation for data access via national representatives:
* a global cooperation, because scientists work globally, scientific data are created and accessed globally.
* with national representatives, because most scientists are embedded in their national funding structures and research organisations.
DataCite was officially launched on December 1st 2009 in London and has 12 information institutions and libraries from nine countries as members. By assigning DOI names to data sets, data becomes citable and can easily be linked to from scientific publications.
Data integration with text is an important aspect of scientific collaboration. DataCite takes global leadership for promoting the use of persistent identifiers for datasets, to satisfy the needs of scientists. Through its members, it establishes and promotes common methods, best practices, and guidance. The member organisations work independently with data centres and other holders of research data sets in their own domains. Based on the work of the German National Library of Science and Technology (TIB) as the first DOI-Registration Agency for data, DataCite has registered over 850,000 research objects with DOI names, thus starting to bridge the gap between data centers, publishers and libraries.
This presentation will introduce the work of DataCite and give examples how scientific data can be included in library catalogues and linked to from scholarly publications.
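As a small example of the linking described above, a dataset DOI can be turned into a formatted reference via DOI content negotiation; the sketch below is illustrative only and the DOI is a placeholder.

```python
import requests

# Illustrative dataset DOI; content negotiation works for any DataCite-registered DOI.
doi = "10.1594/WDCC/EXAMPLE"

resp = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "text/x-bibliography; style=apa"},
)
print(resp.text)  # an APA-style reference for the dataset, if the DOI is registered
```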
The scientific and economic value of research data is enormous. To ensure successful subsequent usage, the scientific community needs efficient access to the data, the data have to be reliable and persistent, and their quality has to be demonstrated.
One way to meet these preconditions is to apply the techniques of today’s scientific publishing to research data. Besides being published in a data repository together with metadata, the data should undergo a transparent public peer review on a publication platform.
The presentation discusses two approaches. On the one hand, the data can be the basis for a research article and undergoes a review parallel to the review of the manuscript. The data is then a reviewed supplement to a scientific publication. On the other hand, the data itself can be the subject of a publication whose quality is then assured by peers.
The presentation provides practical experience, especially with the latter strategy, realized through an established open access journal.
In scientific communication, we observe a complex interaction of several stakeholder groups, each of which have distinct interests, strategies and approaches for Open Access and Open Data. The German government initiated a “Commission for the Future of the Information Infrastructure” (KII) in Germany. In this commission, most of the stakeholders are working together in order to design a future scenario for the supply of scientific information. The KII’s evaluation and recommendations for Open Access as well as research data will be particularly highly recognized and will significantly influence Open Access and Open Data developments in Germany.
I will outline the current situation in Germany – players and their interactions in terms of Open Access and Open Data – and present two initiatives and their work in detail. One of them, the KII process, shows the official side of the story; the other shows the grassroots side.
Research data, as the main product of research, can be unique and is often the result of a complex and cost-intensive research process. Reuse and reinterpretation of such material is envisioned not only to maintain research integrity, but also to accelerate the advancement of science by sharing results at an early stage.
Generally speaking, there is little experience with the preservation, provision and publishing of research data, and so far little research has been done on data publishing models. Historically, this has partly been due to limited infrastructures, but current information technologies facilitate modern and tailored research data provision and publishing.
Why tailored? The characteristics of research data vary across and within disciplines. This results in more complex prerequisites and specifications than for paper publication, whose process is very similar across disciplines. Thus, tailored models are necessary to match the individual characteristics of research data across disciplines. Within this presentation, three different approaches are distinguished: object-centric, text-centric and data-centric. Prerequisites and limitations regarding the timing and scope of data provision will be discussed, and experiences with each of the different models presented.
Regardless of these models, it becomes apparent that due to the individual characteristics of research data, its provision and publication is only possible with the support and knowhow of the research community. This know-how needs to be linked to the competences of infrastructure facilities.
This presentation gives an overview of European Commission policies and initiatives aiming to promote open access to scientific information in the European Research Area (ERA). In this policy area, the Commission acts both as a policymaking and as a funding body. As policymaker, it defines policies within the context of European research and ICT policy. As a funding body, it lays down rules on access to the results of the research it funds within the Framework Programme for research development. This contribution introduces the European Commission's general approach regarding access to scientific information, presents specific initiatives in the field of open access to peer-reviewed scientific publications, and develops a first approach to open access to data.
Open access promises a great deal, but will your dataset be lost in cyberspace and remain unused? What are you doing to ensure discovery and utility for a range of publics? In this session you'll learn about the work done at OECD to make data both discoverable and useful to experts and lay users alike. You'll be introduced to OECD's citation tool and see some of the visualizations being developed to make data tell stories. You'll also see how OECD is presenting its datasets alongside books and journals in a single, seamless, service that is part of the global information network for researchers and students. Finally, Christmas comes early for anyone with an iPhone or iPad. Open Data? There's an OECD App for that.
Almost thirty studies have now been carried out to verify whether or not there is a citation advantage from Open Access. These will be reviewed and some conclusions drawn as to whether Open Access does produce such an advantage and what its nature is. In addition, other advantages - for authors, institutions and nations - from Open Access will be presented and discussed.
Open Access (OA) means free online access to the 2.5 million articles published every year in the world's 25,000 peer-reviewed scholarly and scientific research journals. OA can be provided in two ways: To provide "Green OA," authors self-archive the final refereed drafts of their articles in their institutional OA repositories immediately upon acceptance for publication (by conventional, non-OA journals). To provide "Gold OA," authors publish their articles in OA journals that make all their articles free online immediately upon publication. (Sometimes a fee is charged to the author's institution for Gold OA.) Because of the benefits of OA (in terms of maximized visibility, accessibility, uptake, usage and impact) to research, researchers, their institutions and the taxpayers that fund them, institutions and funders worldwide are increasingly mandating (i.e. requiring) Green OA self-archiving. Gold OA publishing cannot be mandated by authors' institutions and funders, but universal Green OA self-archiving mandates may eventually lead to a global transition to Gold OA publishing; it depends on whether and how long subscriptions remain sustainable as the means of covering the costs of print and online publication; if subscriptions become unsustainable, authors' institutions will pay journal publishers for peer review out of a portion of their annual windfall subscription cancellation savings. Data-archiving cannot be mandated, because researchers must be allowed the exclusive right to mine the data they have collected if they wish; but as Green OA self-archiving grows, data-archiving too will grow, because of their natural complementarity and the power of global collaboration to accelerate and enhance research progress.
There has long been a view that the outputs of publicly funded research should be publicly available. This was understood to mean research papers and findings, and publication in journals and monographs that were virtually unavailable at reasonable cost outside universities was not felt to fully meet this need. Open Access is not an attack on peer review or the scholarly publishing industry (although there are real concerns about escalating costs, which many universities can no longer afford).
The move to open data is driven by more complex arguments, bound up with the need to be more open in demonstrating the uncertain nature of many scientific findings and the need to manage research data more professionally, while ensuring that sensitive or commercially valuable data can be kept secure.
This talk will explain the synergies and ambiguities between open policies and the individual drivers for career researchers and those of universities seeking to balance their responsibilities to society with commercial considerations.
This document discusses open access and open data from the perspective of a funder. It provides an overview of progress in the UK towards open access policies by research councils, universities, and funders. It also discusses the development of open access repositories and journals. For open data, it outlines benefits and drivers, as well as challenges researchers face in sharing data due to lack of incentives and resources. Further work is needed to provide guidance, integrate repositories, and promote strategic debate around open data policies and infrastructure.
Andreas Hense: Climate data for our future – acquired, analysed, archived
1. Acquired – Analysed – Archived: Climate Data for Our Future
Prof. Dr. Andreas Hense
andreas.hense@h-brs.de
"Open Access – Open Data" Expert Conference in Cologne, December 13, 2010
2. Project Partners
Bonn-Rhine-Sieg University of Applied Sciences, Computer Science, Sankt Augustin – Prof. Dr. Andreas V. Hense, Professor for Business Information Systems: project management & software development
Bonn University, Meteorological Institute, Bonn – Prof. Dr. Andreas N. Hense, Professor for Climate Dynamics: experimental data & routines for scientific QA
Deutsches Klimarechenzentrum GmbH, Hamburg – Dr. Michael Lautenschlager, Head of Data Management, Director of the World Data Center for Climate (WDCC): process definitions, routines for technical QA, hosting
3. Agenda
Project Objectives
Meteorological Background
The World Data Center for Climate (WDCC)
The Publication System Atarrabi
5. Location in the Scientific Process
Idea → Preparation → Execution → Analysis (local storage) → Long-term archival → Publication (DOI/URN)
Project focus: the last two stages, long-term archival and publication.
6. Data Publication in Research
Problem: Publication and citation have always been common practice for scientific articles. Scientific articles are often based on data. To check the results of an article or to do further research, the data are necessary.
Solution: Publish the article AND the data.
7. Aspects of Data Publication
Storage location – The volume of data can be huge (e.g. meteorological data). Who can reliably store the data and assure long-term availability and fast access?
Formats – There can be various formats to represent data. Which (meta)data format is the most commonly used?
Exposition/Registry – It is not sufficient to save data "somewhere" on the web. Scientists have to notice the existence of the data. What is the best way to expose data to search engines? Are there well-known (domain-specific) catalogues where data can be registered?
Quality – Not all data are qualified for publication. What are the minimum requirements? What are the (scientific and technical) quality assurance procedures?
Stability – Can data be changed after publication? How are new versions published?
Identifier – How can data be referenced uniformly? Are there any standards?
8. Storage Sites
Research activities (experiment, analyze, collaborate, publish, expose) are carried out against four classes of storage site:
local (research workbench), data visibility: private
group, data visibility: group
institutional (community portal), data visibility: institution
national/global (national repository), data visibility: public
9. A Common Scenario
(Same storage-site/visibility diagram; data are produced and analysed on local and group storage, and a copy is transferred towards the national repository for publication and exposure.)
10. The National Storage Solution
(Same diagram, with the national (global) repository, public visibility, as the storage site for publication.)
11. Project Objectives
Definition of a standard procedure for the publication of observational data, including documentation of quality assurance actions.
Development of a web-based software system that leads the researcher through metadata entry and assists the publication agent in finalising the process.
Integration of the software system into the existing central data repository for meteorology, the World Data Center for Climate (WDCC).
Generalisation of the defined process for other environmental sciences.
12. Meteorological Data
(Image: http://www.flickr.com/photos/nanagyei/4661459810/)
13. Meteorological Data Sources
Climate simulations – data from models: grid-oriented, 2-3 spatial dimensions, 1 time dimension, 1 variable dimension and 1 sampling/probability dimension; large amount, but simple structure.
Experimental data – empirical data: various structures in time and space; not as large in volume, but much more complex.
15. Meteorological Data
We can distinguish experimental (observational) data (small amount, heterogeneous) and climate simulation data (huge amount, simple structure).
Storage location – experimental: WDCC (work in progress); simulation: WDCC
Formats – experimental: NetCDF (with restrictions, work in progress); simulation: NetCDF
Exposition/Registry – experimental: WDCC (work in progress), TIBORDER (work in progress); simulation: CERA catalogue, TIBORDER
Quality – experimental: project focus; simulation: QA more technical than scientific
Stability – no changes to primary data allowed; changes to metadata are restricted
Identifier – Digital Object Identifier (DOI), Uniform Resource Name (URN)
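Since NetCDF is the target format for both data classes, the following is a minimal sketch of how an observational dataset and its descriptive metadata could be written. It assumes the Python netCDF4 library; the file name, variables and attributes are illustrative and not the conventions prescribed by WDCC/CERA.

```python
# Minimal sketch: writing observational data plus descriptive metadata to NetCDF.
# Assumes the netCDF4 Python library; file name, variable and attribute names are
# illustrative, not those prescribed by WDCC/CERA.
from netCDF4 import Dataset
import numpy as np

with Dataset("precipitation_obs.nc", "w", format="NETCDF4") as nc:
    # Global attributes carry the descriptive metadata later needed for publication.
    nc.title = "Precipitation observations (example)"
    nc.institution = "Example Meteorological Institute"
    nc.source = "rain gauge network"

    nc.createDimension("time", None)     # unlimited time axis
    nc.createDimension("station", 3)

    time = nc.createVariable("time", "f8", ("time",))
    time.units = "hours since 2007-06-01 00:00:00"

    precip = nc.createVariable("precip", "f4", ("time", "station"))
    precip.units = "mm"
    precip.long_name = "accumulated precipitation"

    time[:] = np.arange(24)
    precip[:, :] = np.random.default_rng(0).gamma(2.0, 1.5, size=(24, 3))
```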
16. Relevant Meteorological Projects
Experimental data, collected as part of the "Quantitative Precipitation Forecast" priority programme (DFG SPP1167):
Convective and Orographically-induced Precipitation Study (COPS), measurements in the Black Forest in 2007, http://www.cops2007.de.
General Observation Period (GOP), extended measurements in Central Europe in 2007, http://gop.meteo.uni-koeln.de/gop/doku.php.
All participants have agreed to publish the data to support further research.
17. Relevant Meteorological Projects
Climate simulation data: Coupled Model Intercomparison Project Phase 5 (CMIP5).
Standard experimental protocol for studying the output of coupled ocean-atmosphere general circulation models (GCMs).
Provides a community-based infrastructure in support of climate model diagnosis, validation, intercomparison, documentation and data access.
Addresses outstanding scientific questions that arose as part of the IPCC AR4 (Intergovernmental Panel on Climate Change 4th Assessment Report) process.
Provides estimates of future climate change that will be useful to those considering its possible consequences.
18. The World Data Center for Climate (WDCC)
19. Long-Term Archival
The WDCC in Hamburg, Germany operates large databases (60 PB) for the long-term archival of data from climate simulations and weather experiments.
The WDCC is operated by the Deutsches Klimarechenzentrum (DKRZ, German Climate Computing Centre).
Data production: 50 PB/year.
Limit for the mass storage archive: 10 PB/year (data with an expiration date).
Limit for the long-term data archive: 1 PB/year (data without an expiration date).
Currently only a very small amount of data is published (approx. 1.5 TB); this is expected to grow significantly.
20. WDCC Equipment
HPC cluster ("blizzard"): IBM p575 "Power6" cluster, water-cooled, 16 dual-core CPUs per node, 264 nodes in total, 8448 cores.
Total system peak performance: 158 TeraFlop/s (Top500: rank 27 in 06/2009).
20 TeraByte memory.
3 PetaByte GPFS file system (an additional 3 PetaByte in 2011).
21. HLRE2 Data Archive: HPSS
6 Sun StorageTek SL8500 tape libraries: 10,000 media slots per library, 8 robots per library, 73 tape drives.
Total capacity: 60 PetaByte; projected fill rate: 10 PetaByte/year.
22. The Publication System Atarrabi
(Image: http://www.flickr.com/photos/17258892@N05/2588347668/)
23. Publication via DOI and URN
Experiments of particular importance can be published with a DOI and a URN.
The decision is made at the WDCC.
DOI and URN registration is handled by the TIB Hannover.
Data are double-checked before publication (scientific and technical quality assurance).
Most important is a complete and correct metadata record.
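To illustrate the emphasis on a complete and correct metadata record, here is a minimal sketch of a DataCite-style record together with the kind of completeness check performed before registration. The field names and the check are assumptions for illustration, not the actual CERA/TIB schema.

```python
# Illustrative sketch only: a minimal, DataCite-style metadata record and a
# completeness check of the kind performed before DOI/URN registration.
# Field names and the check are assumptions, not the actual CERA/TIB schema.
REQUIRED_FIELDS = ("identifier", "creators", "title", "publisher", "publication_year")

def missing_fields(record: dict) -> list:
    """Return the required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

record = {
    "identifier": "10.1594/WDCC/EXAMPLE_DATASET",   # hypothetical DOI
    "creators": ["Example, Researcher"],
    "title": "Example observational dataset",
    "publisher": "World Data Center for Climate (WDCC)",
    "publication_year": 2010,
}

missing = missing_fields(record)
if missing:
    print("Metadata incomplete, cannot register DOI:", missing)
else:
    print("Metadata record complete, ready for registration via TIB Hannover.")
```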
24. System Context
Atarrabi, the publication system, is built on a workflow engine and connects:
the researcher, who enters the metadata;
the publication agent at WDCC Hamburg, who finalises the process;
the CERA2 metadata database at WDCC Hamburg;
the TIBORDER catalogue at TIB Hannover, where the DOI and URN are registered.
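As a rough illustration of how such a workflow could be modelled, the following sketch represents a publication request as a simple state machine. The states and transitions are assumptions for illustration, not Atarrabi's actual workflow definition.

```python
# Minimal sketch of a publication workflow as a simple state machine.
# States and transitions are assumptions for illustration; they are not
# Atarrabi's actual workflow definition.
from enum import Enum, auto

class State(Enum):
    METADATA_ENTRY = auto()   # researcher fills in metadata
    AGENT_REVIEW = auto()     # publication agent checks and finalises
    REGISTRATION = auto()     # DOI/URN registered, entry stored in CERA2
    PUBLISHED = auto()

TRANSITIONS = {
    State.METADATA_ENTRY: State.AGENT_REVIEW,
    State.AGENT_REVIEW: State.REGISTRATION,
    State.REGISTRATION: State.PUBLISHED,
}

def advance(state: State) -> State:
    """Move the publication request to the next workflow stage."""
    return TRANSITIONS.get(state, state)

state = State.METADATA_ENTRY
while state is not State.PUBLISHED:
    state = advance(state)
    print("->", state.name)
```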
25. Wizard-Based Metadata Entry
Metadata fields are divided into several logical units.
The user can leave the wizard at any time and return later to continue.
Wizard pages: General – Spatial and temporal coverage – Instruments.
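A minimal sketch of how such a resumable, step-wise metadata wizard could be organised; the step names and fields are illustrative assumptions, not Atarrabi's actual metadata schema.

```python
# Minimal sketch of wizard-style metadata entry split into logical units,
# with the ability to stop and resume later. Step names and fields are
# illustrative assumptions, not Atarrabi's actual metadata schema.
import json
from typing import Optional

WIZARD_STEPS = {
    "general": ["title", "creators", "summary"],
    "spatial_temporal_coverage": ["spatial_extent", "temporal_extent"],
    "instruments": ["instrument_name", "instrument_type"],
}

def save_progress(answers: dict, path: str = "wizard_state.json") -> None:
    """Persist partially entered metadata so the user can return later."""
    with open(path, "w") as fh:
        json.dump(answers, fh)

def next_incomplete_step(answers: dict) -> Optional[str]:
    """Return the first logical unit that still has unanswered fields."""
    for step, fields in WIZARD_STEPS.items():
        if any(field not in answers for field in fields):
            return step
    return None

# Example: the researcher fills in part of the "general" unit, then stops.
answers = {"title": "Example observational dataset", "creators": ["Example, Researcher"]}
save_progress(answers)
print("Resume at step:", next_incomplete_step(answers))
```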
27. Acquired – Analysed – Archived: Climate Data for Our Future
Prof. Dr. Andreas Hense
andreas.hense@h-brs.de
Visit us: umwelt.wikidora.com