Concise Preservation by combining Managed
Forgetting and Contextualized Remembering
Simona Rabinovici-Cohen
IBM Research - Haifa
WP 7 Presentation
Computational Storage Services
ForgetIT 1st Review Meeting, April 29-30, 2014
Kaiserslautern, Germany
WP Objectives
• Increase the value and outcome of preserved information over time
–Provide additional incentive for preservation
–Increase return-on-investment (ROI)
• Transform the generic storage service to a richer service with
potentially higher business value and automated preservation
processes
Focus of Year 1
• Build a consolidated platform for objects and computational
processes (storlets) that will be defined, triggered and executed
close to the data
• Utilize the open-source OpenStack Swift for cloud storage
ForgetIT Project GA600826, 1st Review Meeting, Kaiserslautern, April 2014
Objectives of WP and Year 1 Focus
Role in Preserve-or-Forget Architecture
Leveraged PDS and the Storlet Engine, adding:
Adapt Preservation Engine for ForgetIT
Rules mechanism
Storlets at interface proxy servers and local object servers
Multiple programming languages for storlets
New storlets:
image transformation storlet
fixity storlet
concept detection storlet
Searchable metadata contributions to OpenStack community
Integration with whole ForgetIT framework
Co-chair LTR group in SNIA to develop SIRF
Achievements in Year 1
Preservation DataStores (PDS)
PDS offloads some archiving
functionality to:
Decrease probability of data loss
Simplify the applications
Provide improved performance and
robustness
Supports automation of archiving
processes
Provides computational storage via
Storlet Engine
PDS was also the storage infrastructure of the EU research projects CASPAR and ENSURE,
with partners including the European Space Agency, Maccabi HMO, Tessella, Philips and more
PDS in OAIS
Functional Model
AIP
• OAIS is the ISO standard reference
model for preservation
(ISO 14721:2003)
• Provides fundamental ideas,
concepts and a reference
model for long-term archives
• Archival Information Package
(AIP) - a logical structure for the
preservation object that needs
to be stored to enable future
interpretation
• Content Data Object (CDO) –
raw data to be preserved
PDS
DSpace and PDS
PDS Data Model
Hierarchical data model
Tenant Aggregation and Tenant Docket object (AIP)
Flexible organization of assets in collections with varied preservation policies (gold, silver, bronze)
Aggregations support dynamic and transparent configuration of data management
[Diagram, example 1: tenant "Peter Stainer"; aggregations "Business Photos (silver)" and "Private Photos (gold)"; dockets "Costa Rica 2013" and "Edinburgh"; objects (AIP) with metadata aggregation=Private or aggregation=Business]
[Diagram, example 2: tenant "Spielwarenmessen"; aggregation "Press Releases (gold)"; docket "Toy Conference 2014"; two objects (AIP), each with metadata aggregation=Press]
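The hierarchy on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`Tenant`, `Docket`, `PreservationObject`, `policy_for`) and the object name `beach.jpg` are hypothetical and not part of PDS, and the placement of the object in the "Costa Rica 2013" docket is assumed. The sketch shows one way an object's preservation policy could be resolved through its `aggregation` metadata.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; this is not the PDS API.
POLICY_LEVELS = ("gold", "silver", "bronze")


@dataclass
class PreservationObject:
    """An object (AIP) tagged with the aggregation it belongs to."""
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Docket:
    """A named grouping of objects, e.g. one event or trip."""
    name: str
    objects: list = field(default_factory=list)


@dataclass
class Tenant:
    name: str
    aggregations: dict = field(default_factory=dict)  # aggregation name -> policy level
    dockets: list = field(default_factory=list)

    def add_aggregation(self, name, policy):
        """Create an aggregation or change the policy of an existing one."""
        if policy not in POLICY_LEVELS:
            raise ValueError(f"unknown policy: {policy}")
        self.aggregations[name] = policy

    def policy_for(self, obj):
        """Resolve an object's policy through its 'aggregation' metadata."""
        return self.aggregations.get(obj.metadata.get("aggregation"))


tenant = Tenant("Peter Stainer")
tenant.add_aggregation("Private", "gold")
tenant.add_aggregation("Business", "silver")

photo = PreservationObject("beach.jpg", {"aggregation": "Private"})
tenant.dockets.append(Docket("Costa Rica 2013", [photo]))

policy_before = tenant.policy_for(photo)
# Reconfiguring the aggregation changes the policy of every object in it,
# without touching the objects themselves.
tenant.add_aggregation("Private", "bronze")
policy_after = tenant.policy_for(photo)
print(policy_before, policy_after)
```

Keeping the policy on the aggregation rather than on each object is what makes reconfiguration transparent in this sketch: the object's metadata never changes.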
The Need for Computational Object Storage
• “Data is the new Oil”
– In its raw form, oil has little value
– Once processed and refined, it helps power the world
• Data deluge of content depots and unstructured data
– Documents, medical images, photos, videos, etc.
– The fastest growing type of storage by volume
– Object storage is ideal for this type of data
• Object storage for content depots generally:
– Utilizes large bandwidth to serve big data over the WAN
– Uses server-based storage with underutilized CPUs
• Process and refine the data where it is stored
– Create a computational object storage with storlets
“Data is the new oil.” Clive Humby
Client Value for Using Storlets
Reduce bandwidth – reduce the number of bytes transferred over the
WAN
e.g. Analytics storlet
Enhance security – reduce exposure of sensitive data
e.g. De-identification storlet
Save costs – consolidate generic functions that can be used by many
applications while saving infrastructure at the client side
e.g. Curation storlet
Support compliance – monitor and document the changes to the
objects and improve provenance tracking
e.g. Transformation storlet
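The bandwidth argument can be illustrated with a small Python sketch. It assumes a hypothetical storlet signature (`in_stream`, `out_stream`, `params`) and a made-up CSV object; it is not the Storlet Engine interface. The function runs where the data is stored and writes back only a short JSON summary instead of the whole object.

```python
import io
import json


def analytics_storlet(in_stream, out_stream, params):
    """Hypothetical storlet: summarize one numeric column of a CSV object.

    Reads the object line by line next to the data and writes only a
    small JSON summary to the output stream.
    """
    column = int(params.get("column", 0))
    count, total = 0, 0.0
    for line in in_stream:
        fields = line.decode("utf-8").strip().split(",")
        try:
            total += float(fields[column])
        except (ValueError, IndexError):
            continue  # skip the header row and malformed lines
        count += 1
    mean = total / count if count else None
    out_stream.write(json.dumps({"count": count, "mean": mean}).encode("utf-8"))


# Made-up stored object: a header row plus 1000 data rows.
stored = b"patient,reading\n" + b"".join(
    b"p%d,%d\n" % (i, i % 10) for i in range(1000)
)

result = io.BytesIO()
analytics_storlet(io.BytesIO(stored), result, {"column": "1"})
summary = json.loads(result.getvalue())
print(len(stored), "bytes stored;", len(result.getvalue()), "bytes returned")
```

Only the summary would cross the WAN in this setup; the stored object stays on the storage node.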
Storlet Engine Architecture
Rules Mechanism
Enables automatic conditional invocation of storlets
Explicit storlet activation overrides implicit activation
Rules are kept as a per-tenant editable object, with specified access
control
Configured by tenant, user, role, container, object,
content_type
Wildcards (“*”) allowed in a rule (high flexibility)
The first rule that matches the input is activated – prioritized
list of rules
Examples:
De-Identification (per Role)
Transformation (per Content Type)
Fixity (per docket)
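The matching behaviour described above (prioritized list, first match wins, wildcards, explicit activation overriding the rules) can be sketched as follows. The rule layout, the `select_storlet` helper, the sample rule values and the mapping of a docket to a `docket-*` container name are assumptions made for illustration; the slides do not show the actual rule syntax.

```python
from fnmatch import fnmatchcase

# The request attributes a rule may constrain, as listed on the slide.
RULE_FIELDS = ("tenant", "user", "role", "container", "object", "content_type")

# Hypothetical prioritized rule list: earlier entries win.
# A field that a rule omits is treated as the wildcard "*".
RULES = [
    {"role": "researcher", "content_type": "application/dicom",
     "storlet": "de-identification"},
    {"content_type": "image/*", "storlet": "transformation"},
    {"container": "docket-*", "storlet": "fixity"},
]


def select_storlet(request, rules=RULES, explicit=None):
    """Return the storlet to invoke for a request, or None.

    An explicitly requested storlet overrides the rules; otherwise the
    first rule whose patterns all match the request is activated.
    """
    if explicit is not None:
        return explicit
    for rule in rules:
        if all(fnmatchcase(request.get(f, ""), rule.get(f, "*"))
               for f in RULE_FIELDS):
            return rule["storlet"]
    return None


by_role = select_storlet({"role": "researcher",
                          "content_type": "application/dicom"})
by_type = select_storlet({"role": "researcher", "content_type": "image/png"})
by_container = select_storlet({"container": "docket-2014",
                               "content_type": "text/plain"})
overridden = select_storlet({"content_type": "image/png"}, explicit="fixity")
print(by_role, by_type, by_container, overridden)
```

Because evaluation stops at the first match, the order of the list is the priority: a broad wildcard rule placed early would shadow every more specific rule after it.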
Storlets at proxy node and object node
[Diagram: Swift cluster layout]
Network: virtual IP; L3 switches (10 Gb Ethernet); L2 rack switches (1 Gb Ethernet)
Nodes: proxy nodes; account node (SSD); container node (SSD); object nodes (HDD)
Swift Proxy Node: proxy service and Storlet Engine
Swift Object Node: object service and Storlet Engine
Fixity Storlet
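The slide content for the fixity storlet did not survive extraction. As a rough illustration of what a fixity check generally involves, the sketch below recomputes a digest over the stored bytes and compares it with a previously recorded value. The function names, the choice of SHA-256 and the report layout are assumptions, not the project's implementation.

```python
import hashlib
import io


def compute_digest(stream, algorithm="sha256", chunk_size=64 * 1024):
    """Hash a stream in chunks so large objects need not fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fixity_check(stream, expected_digest, algorithm="sha256"):
    """Compare the recomputed digest with the recorded one."""
    actual = compute_digest(stream, algorithm)
    return {
        "algorithm": algorithm,
        "expected": expected_digest,
        "actual": actual,
        "ok": actual == expected_digest,
    }


data = b"preserved content"
recorded = hashlib.sha256(data).hexdigest()  # digest stored at ingest time

intact = fixity_check(io.BytesIO(data), recorded)
corrupted = fixity_check(io.BytesIO(b"preserved c0ntent"), recorded)
print(intact["ok"], corrupted["ok"])
```

Running such a check inside the storage means only the small report leaves the node, not the object itself.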
• Papers
• S. Rabinovici-Cohen, E. Henis, J. Marberg, K. Nagin, “Storlet Engine: Performing
Computations in Cloud Storage”, to be submitted
• S. Rabinovici-Cohen, R. Cummings, S. Fineberg, “Self-contained Information
Retention Format For the, to be submitted
• Posters
• S. Rabinovici-Cohen (IBM), M. Baker (HP), R. Cummings (Antesignanus), S. Fineberg
(HP), E. Henis (IBM), "Self-contained Information Retention Format (SIRF) in
ForgetIT EU Project", 6th International Systems and Storage Conference (SYSTOR),
2013
• Other Dissemination Activities
• The Storage Networking Industry Association (SNIA) announced in its March 2013
Newsletter that the SNIA Long Term Retention group formed a liaison with ForgetIT
• The tutorial "Combining SNIA Cloud, Tape and Container Format Technologies for
the Long Term Retention of Big Data" is given at several SNIA conferences
• Deliverables
• D7.1: Foundation of Computational Storage Services
• D7.2: Computational Storage Services First Release
Publications
Thank you for your attention!

 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 

Computational Storage Services (WP7 ForgetIT 1st year review)

  • 1. Concise Preservation by combining Managed Forgetting and Contextualized Remembering
  • 2.
  • 3. Simona Rabinovici-Cohen, IBM Research - Haifa. WP 7 Presentation: Computational Storage Services. ForgetIT 1st Review Meeting, April 29-30, 2014, Kaiserslautern, Germany
  • 4. Objectives of WP and Year 1 Focus
      WP Objectives
      • Increase the value and outcome of preserved information over time
        – Provide additional incentive for preservation
        – Increase return-on-investment (ROI)
      • Transform the generic storage service to a richer service with potentially higher business value and automated preservation processes
      Focus of Year 1
      • Build a consolidated platform for objects and computational processes (storlets) that will be defined, triggered and executed close to the data
      • Utilize the OpenStack Swift open source for cloud storage
      ForgetIT Project GA600826, 1st Review Meeting, Kaiserslautern, April 2014
  • 5. Role in Preserve-or-Forget Architecture
  • 6. Achievements in Year 1
      • Leveraged PDS and Storlet Engine adding:
      • Adapt Preservation Engine for ForgetIT
      • Rules mechanism
      • Storlets at interface proxy servers and local object servers
      • Multiple programming languages for storlets
      • New storlets:
        – image transformation storlet
        – fixity storlet
        – concept detection storlet
      • Searchable metadata contributions to OpenStack community
      • Integration with whole ForgetIT framework
      • Co-chair LTR group in SNIA to develop SIRF
  • 7. Preservation DataStores (PDS)
      • PDS offloads some archiving functionality to:
        – Decrease probability of data loss
        – Simplify the applications
        – Provide improved performance and robustness
      • Supports automation of archiving processes
      • Provides computational storage via Storlet Engine
      • PDS was also the storage infrastructure of the EU research projects CASPAR and ENSURE, with partners including the European Space Agency, Maccabi HMO, Tessella, Philips and more
  • 8. PDS in OAIS Functional Model
      • OAIS is the ISO standard reference model for preservation (ISO:14721:2002)
      • Provides fundamental ideas, concepts and a reference model for long-term archives
      • Archival Information Package (AIP): a logical structure for the preservation object that needs to be stored to enable future interpretation
      • Content Data Object (CDO): raw data to be preserved
      [Figure labels: AIP, PDS]
  • 9. DSpace and PDS
  • 10. PDS Data Model
      • Hierarchical data model: Tenant Aggregation and Tenant Docket object (AIP)
      • Flexible organization of assets in collections with varied preservation policies (gold, silver, bronze)
      • Aggregations support dynamic and transparent configuration of data management
      [Figure: example hierarchy]
      • Tenant Peter Stainer: aggregations Business Photos (silver) and Private Photos (gold); dockets Costa Rica 2013 and Edinburgh; objects (AIPs) with metadata aggregation=Private and aggregation=Business
      • Tenant Spielwarenmessen: aggregation Press Releases (gold); docket Toy Conference 2014; objects (AIPs) with metadata aggregation=Press
  • 11. The Need for Computational Object Storage
      • “Data is the new oil” (Clive Humby)
        – In its raw form, oil has little value
        – Once processed and refined, it helps power the world
      • Data deluge of content depots and unstructured data
        – Documents, medical images, photos, videos, etc.
        – The fastest growing type of storage by volume
        – Object storage is ideal for this type of data
      • Object storage for content depots generally:
        – Utilizes large bandwidth to serve big data over the WAN
        – Uses server-based storage with underutilized CPUs
      • Process and refine the data where it is stored
        – Create a computational object storage with storlets
  • 12. Client Value for Using Storlets
      • Reduce bandwidth: reduce the number of bytes transferred over the WAN, e.g. Analytics storlet
      • Enhance security: reduce exposure of sensitive data, e.g. De-identification storlet
      • Save costs: consolidate generic functions that can be used by many applications while saving infrastructure at the client side, e.g. Curation storlet
      • Support compliance: monitor and document the changes to the objects and improve provenance tracking, e.g. Transformation storlet
  • 13. Storlet Engine Architecture
  • 14. Rules Mechanism
      • Enables automatic conditional invocation of storlets
      • Explicit storlet activation overrides implicit activation
      • Rules are kept as a per-tenant editable object, with specified access control
      • Configured by tenant, user, role, container, object, content_type
      • Wildcards (“*”) allowed in a rule (high flexibility)
      • The rules form a prioritized list: the first rule that matches the input is activated
      • Examples:
        – De-Identification (per Role)
        – Transformation (per Content Type)
        – Fixity (per docket)
  • 15. Storlets at proxy node and object node
      [Figure: Swift cluster with L3 switches (10GB Ethernet, virtual IP), L2 rack switches (1GB Ethernet), proxy nodes, account node (SSD), container node (SSD) and object nodes (HDD). The Swift proxy node runs the proxy service with a Storlet Engine; the Swift object node runs the object service with a Storlet Engine.]
  • 16. Fixity Storlet
  • 17. Publications
      • Papers
        – S. Rabinovici-Cohen, E. Henis, J. Marberg, K. Nagin, “Storlet Engine: Performing Computations in Cloud Storage”, to be submitted
        – S. Rabinovici-Cohen, R. Cummings, S. Fineberg, “Self-contained Information Retention Format For the, to be submitted
      • Posters
        – S. Rabinovici-Cohen (IBM), M. Baker (HP), R. Cummings (Antesignanus), S. Fineberg (HP), E. Henis (IBM), "Self-contained Information Retention Format (SIRF) in ForgetIT EU Project", 6th International Systems and Storage Conference (SYSTOR), 2013
      • Other Dissemination Activities
        – The Storage Networking Industry Association (SNIA) published in its March 2013 Newsletter that SNIA Long Term Retention group formed a liaison with ForgetIT
        – The tutorial "Combining SNIA Cloud, Tape and Container Format Technologies for the Long Term Retention of Big Data" is given at several SNIA conferences
      • Deliverables
        – D7.1: Foundation of Computational Storage Services
        – D7.2: Computational Storage Services First Release
  • 18.
  • 19. Thank you for your attention!
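
The hierarchical data model on slide 10 can be pictured with a small sketch. This is one possible reading, not the PDS implementation: it assumes that objects (AIPs) live in dockets, and that each object names its aggregation through an `aggregation` metadata key, which in turn decides the preservation policy. The class names, the `policy_for` helper and the object names `photo-1` and `photo-2` are hypothetical; the tenant, docket and aggregation names come from the slide, but which object sits in which docket is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class PreservedObject:
    """An object (AIP); its metadata names the aggregation it belongs to."""
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Docket:
    name: str
    objects: list = field(default_factory=list)


@dataclass
class Aggregation:
    name: str
    policy: str  # "gold", "silver" or "bronze"


@dataclass
class Tenant:
    name: str
    # Keyed by the label used in object metadata (assumed layout).
    aggregations: dict = field(default_factory=dict)
    dockets: list = field(default_factory=list)

    def policy_for(self, obj):
        """Return the policy of the aggregation the object's metadata names."""
        aggregation = self.aggregations.get(obj.metadata.get("aggregation"))
        return aggregation.policy if aggregation else None


# Hypothetical objects; the placement in dockets is an assumption.
private_photo = PreservedObject("photo-1", {"aggregation": "Private"})
business_photo = PreservedObject("photo-2", {"aggregation": "Business"})

tenant = Tenant(
    name="Peter Stainer",
    aggregations={
        "Private": Aggregation("Private Photos", "gold"),
        "Business": Aggregation("Business Photos", "silver"),
    },
    dockets=[
        Docket("Costa Rica 2013", [private_photo]),
        Docket("Edinburgh", [business_photo]),
    ],
)

policy_before = tenant.policy_for(business_photo)

# Reconfiguring the aggregation changes the policy of every member object
# without touching the objects themselves.
tenant.aggregations["Business"].policy = "gold"
policy_after = tenant.policy_for(business_photo)

print(policy_before, policy_after)
```

Resolving the policy through the aggregation, rather than storing it on each object, is what would let a tenant reconfigure data management for a whole collection in one place.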
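
The rules mechanism on slide 14 (a prioritized list, first match wins, "*" wildcards, explicit activation overriding implicit activation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Storlet Engine's actual rule format: the `Rule` class, the `select_storlet` function, the storlet names and the request values are all hypothetical, and mapping "per docket" to a container pattern is a guess. Only the six attribute names come from the slide.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

# Request attributes a rule can be configured by, as listed on the slide.
FIELDS = ("tenant", "user", "role", "container", "object", "content_type")


@dataclass
class Rule:
    """One entry in a tenant's prioritized rule list (hypothetical layout)."""
    storlet: str
    patterns: dict = field(default_factory=dict)  # attribute -> pattern

    def matches(self, request):
        # An attribute missing from the rule defaults to "*" (match anything).
        # Note: fnmatchcase also interprets "?" and "[...]", which goes
        # beyond the plain "*" wildcard the slide mentions.
        return all(
            fnmatchcase(request.get(name, ""), self.patterns.get(name, "*"))
            for name in FIELDS
        )


def select_storlet(rules, request, explicit=None):
    """Pick the storlet to run for a request, or None if nothing applies."""
    if explicit is not None:
        # Explicit storlet activation overrides implicit (rule-based) activation.
        return explicit
    for rule in rules:  # list order is the priority order
        if rule.matches(request):
            return rule.storlet  # the first matching rule is activated
    return None


# Hypothetical rules mirroring the slide's three examples.
RULES = [
    Rule("deidentification", {"role": "researcher"}),    # per role
    Rule("transformation", {"content_type": "image/*"}),  # per content type
    Rule("fixity", {"container": "docket-*"}),            # per docket (assumed)
]

request = {
    "tenant": "t1",
    "user": "alice",
    "role": "researcher",
    "container": "docket-1",
    "object": "scan.jpg",
    "content_type": "image/jpeg",
}

chosen = select_storlet(RULES, request)
print(chosen)
```

All three rules match this request, so the order of the list alone decides which storlet runs; reordering `RULES` is how a tenant would change priorities in this sketch.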