Data grids are an emerging technology that enables the formation of sharable collections from data distributed across multiple storage resources. The integrated Rule-Oriented Data System (iRODS) is a data grid developed by the DICE Center at UNC-CH. The iRODS data grid enforces management policies that control properties of the collection. Examples of policies include retention, disposition, distribution, replication, metadata extraction, time-dependent access controls, data processing, data redaction, and integrity checking. Policies can be defined to automate administrative functions (file migration and replication) and to validate assessment criteria (authenticity, integrity, and chain of custody). iRODS is used to build data sharing environments, digital libraries, and preservation environments. At UNC-CH, the iRODS data grid supports the Carolina Digital Repository, the LifeTime Library for the School of Information and Library Science, data grids for the Renaissance Computing Institute (RENCI), collaborations within North Carolina, and both national and international data sharing. At RENCI, the TUCASI data grid supports shared collections among UNC-CH, Duke, and NCSU. The RENCI data grid is federated with ten other data grids, including the National Climatic Data Center, the Texas Advanced Computing Center data grid, and the Ocean Observatories Initiative data grid. International applications include the CyberSKA Square Kilometer Array for radio astronomy and the French National Institute for Nuclear Physics and Particle Physics. The assembled collections may contain hundreds of millions of files and petabytes of data. A specific goal is the integration of institutional repositories with the national data infrastructure that is being assembled under the NSF DataNet program. The software is available as an open-source distribution from http://irods.diceresearch.org.