Information Consolidation and Concentration (WP4 ForgetIT 1st year review) - ForgetIT Project
The document discusses techniques for information condensation and consolidation developed as part of the ForgetIT project. It describes the role of the Extractor and Condensator components in extracting and processing information from textual and multimedia data. The Extractor performs tasks like named entity extraction and visual feature extraction from images. The Condensator then uses the extracted information to generate summaries, for example by performing text summarization or clustering images. The document also provides examples of the project's achievements in year 1, which included developing services for text and image analysis and integrating some techniques into other work packages.
Joint Information and Preservation Management (WP5 ForgetIT 1st year review) - ForgetIT Project
The document discusses work done in Year 1 of the ForgetIT project to improve preservation by combining managed forgetting and contextualized remembering. Key achievements include identifying the need for a context-aware preservation manager, automatically preparing submission information packages (SIPs), and enabling smooth transitions between systems using CMIS. Focus for Year 2 includes further improving information and preservation management workflows, designing and implementing the context-aware preservation manager, and handling preservation information exchange.
Contextualization / Decontextualization (WP6 ForgetIT 1st year review) - ForgetIT Project
This document discusses contextualization as part of the ForgetIT project. It presents a formal model of contextualization that defines context, interpretation, and the contextualization process. The focus of the first year is to review the state-of-the-art, develop a formal ForgetIT model of contextualization, and create prototype contextualization components. Examples are provided of contextualizing images by finding similar collections and adding context, and contextualizing text by disambiguating concepts and storing surrounding context from an ontology.
Personal Preservation (WP9 ForgetIT 1st year review) - ForgetIT Project
This document summarizes the focus and achievements of the first year of the ForgetIT project, which aims to develop techniques for personal information preservation and managed forgetting. In the first year, the project focused on identifying scenarios and requirements, extending the underlying Personal Information Model ontology, and developing initial prototypes and mock-ups. Prototypes included extending the Semantic Desktop infrastructure and developing a photo organization tool leveraging the Personal Information Model. The document outlines the role of the Semantic Desktop components in the overall architecture and demonstrates early prototypes and datasets.
The Preserve-or-Forget Reference Model and Framework (WP8 ForgetIT 1st year review) - ForgetIT Project
Design of the Preserve-or-Forget framework architecture, definition of the integration approach for all the components developed in the other technical work packages, and definition of a preliminary reference model.
Managed Forgetting (WP3 - ForgetIT 1st year review) - ForgetIT Project
Data model and a computation method based on Semantic Web technologies; integration with the PIMO semantic desktop and the Preserve-or-Forget middleware; exploratory studies: collective memory analysis of public events in Wikipedia, high-impact feature analysis for content retention in the Social Web, and feature selection for efficiency and scalability.
Digital dark age - Are we doing enough to preserve our website heritage? - Olivier Dobberkau
When creating web sites, we often plan for a lifespan of only three to five years. With every relaunch and overhaul we are confronted with content migration and short-term motives to delete potentially valuable content. On the other hand, what is the value of our content? Can we assess it meaningfully? Do we really know in which context it is used?
Scientists have pointed out that while we produce more and more digital artifacts, we fail to preserve them in a manner that will let us find and use them more than a few years into the future.
This talk will introduce you to the aspects of digital preservation, with a special look at how TYPO3 is preparing to help its users create a digital heritage.
This talk is part of the "Concise Preservation by combining Managed Forgetting and Contextualized Remembering" project, ForgetIT. The ForgetIT project is funded by the EC within the 7th Framework Programme under the objective "Digital Preservation" (GA 600826).
Foundations of Forgetting and Remembering (WP2 - ForgetIT 1st year review) - ForgetIT Project
Conceptual foundations of human and organizational remembering and forgetting in order to identify aspects of human memory and forgetting that might be helpful in the design of a digital preservation and managed forgetting system.
This document outlines a project between the Odum Institute and IQSS Dataverse team to integrate the Dataverse data repository system with iRODS, an open source data management system. The goals are to expand storage options for Dataverse, integrate curation workflows, and connect Dataverse to national research data infrastructure. A prototype will be developed to enable automated ingest of data from Dataverse to iRODS using rules and APIs. Challenges include migrating both systems to newer versions while maintaining authentication between them. An initial prototype is expected in August 2015.
The document outlines the history of building a big data platform from 2014 to 2016, starting with building a Hadoop cluster in 2014, creating the first data report page in 2015, launching products based on big data also in 2015, developing data analysis products in 2016, and making changes to the platform in 2016. It then transitions to discussing the current state of the big data platform.
The document discusses the NIF Data Federation and Concept Mapping Tool. The NIF Data Federation provides the ability to search across individually hosted neuroscience databases and datasets. It currently indexes over 232 databases containing over 358 million records. The Concept Mapping Tool was developed to manage federated resources by setting up database mappings and exporting data to Google Refine for concept mapping. The document also lists several integrated virtual databases created by NIF that combine related data from multiple sources into a single view.
iRODS is an open source data management software developed by DICE at UNC and UCSD as a follow-on to SRB. It provides a customizable, policy-driven framework for implementing data grids and managing data across heterogeneous storage resources. Key features include modularity, extensibility through microservices and rules, and interoperability with systems like HDF5, NetCDF, and storage systems through integration extensions. RENCI provides support and commercial offerings around iRODS through their E-iRODS distribution.
2016 URISA track: NHD Hydro Linked Data Registry by Michael Tinker - GIS in the Rockies
Michael Tinker presented on using ScienceBase and linked data to share hydrological event data beyond the standard USGS point event domains currently included in the National Hydrography Dataset (NHD). ScienceBase allows users to store and share hydro linked data in communities, generates web services, and honors FGDC-compliant metadata. A pilot project used ScienceBase to model a hydro linked data community for sharing events in the Lower Colorado River System beyond what is contained in the NHD. ScienceBase offers benefits like web services, metadata, and a place to store and share NHD hydro linked data with downstream applications.
This document discusses a project investigating the use of Archivematica to preserve research data long-term. The project team includes representatives from a library, archives, and IT. In phase one, the team produced reports on Archivematica and how to enhance it for research data management. Planned enhancements in phase two include improved workflows for research data, allowing access copies to work with different repositories, and reducing bottlenecks for large data. The deliverables will be enhancements to Archivematica, implementation plans, and presentations/papers.
Numerous scientific teams use the HDF5 format to store very large datasets. Efficient use of this data in a distributed environment depends on client applications being able to read any subset of the data without transferring the entire file to the local machine. The goal of the HDF5-iRODS Project was to develop an HDF5-iRODS module for the iRODS datagrid server that supported this capability, and to apply the technology to an NCSA/SDSC Strategic Applications Program (SAP) project, FLASH.
A joint team from The HDF Group (representing NCSA) and the SDSC SRB group collaborated to accomplish the project goal. The team implemented five HDF5 microservices functions on the iRODS server, and developed an iRODS FLASH slice client application. The client implementation also includes a JNI interface that allows HDFView, a standard tool for browsing HDF5 files, to access HDF5 files stored remotely in iRODS. Finally, three new collection client/server calls were added to the iRODS APIs, making it easier for users to query the content of an iRODS collection.
Efficient and effective: can we combine both to realize high-value, open, sca... - Research Data Alliance
The document discusses the INDIGO-DataCloud project, which aims to develop an open source cloud platform for computing and data management tailored for science. It seeks to address gaps in interoperability, scalability, and data handling across public and private clouds. The project defined requirements from various scientific communities and developed components implementing its architecture to provide solutions for distributed computing and data resources.
Aashish Chaudhary gave a presentation on Kitware's work with scientific computing and visualization using HDF. HDF is a widely used data format at Kitware for domains like climate modeling, geospatial visualization, and information visualization. Kitware is looking to improve HDF support for cloud and web environments to enable streaming analytics and web-based data analysis. The company also aims to further open source collaboration and scientific computing.
The document outlines Rik van den Bosch's work plan for developing the Soil Data Facility (SDF) from 2018-2020. Key aspects of the plan include developing technical specifications for the Global Soil Information System (GloSIS) and its data products from 2017-2018, building GloSIS infrastructure and populating data products from 2018-2020. It also discusses specifications for GloSIS point databases and grids, a vision document outlining GloSIS, engaging data providers, and connecting providers and users. The document proposes roles for various organizations in developing GloSIS, with SDF leading technical development, FAO hosting and developing the discovery hub, and all organizations working together.
The document discusses Esri's tools and roadmap for working with multi-dimensional (MD) scientific data in ArcGIS. It outlines Esri's efforts to directly read HDF, GRIB, and netCDF files as raster layers or feature/table views in ArcGIS. MD mosaic datasets allow users to manage variables and dimensions across multiple files and perform on-the-fly computations and visualization of MD data. New functions have been added to improve MD data analysis and visualization, including a vector field renderer to depict raster data as vectors. Esri is also working to better support OPeNDAP data sources.
This document outlines a 3-phase project to define minimum research data management infrastructure components required for EPSRC compliance. Phase 1 will develop profiles of institutional approaches to be more discoverable and comparable. Phase 2 will gather more detailed information on approaches, infrastructure needs, and propose standards. It will also identify shared service needs and challenges to compliance. Phase 3 will identify metrics for evaluating research data management service delivery and quality.
The document summarizes the DEEP-Hybrid-DataCloud project, which received EU Horizon 2020 funding. The project aims to develop intensive computing techniques and services for extremely large datasets using specialized hardware. It will implement pilot applications in deep learning, post-processing, and online data analysis. The consortium includes 9 academic and 1 industrial partner from 6 countries. The work is organized into work packages focused on applications, testbeds, accelerated computing, hybrid cloud solutions, and delivering services. The project held its kickoff meeting in January 2018 and outlined its work program and initial design phases.
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub - Björn Backeberg
This presentation was given during the Japan Geosciences Union 2019. Session details can be found at http://www.jpgu.org/meeting_e2019/SessionList_en/detail/M-GI31.htm
Harris Corporation provides geospatial software and analytics tools to access and analyze scientific data from remote sensing platforms. Their ENVI and IDL software support common data formats like HDF and NetCDF and provide capabilities for calibration, bowtie correction, reprojection, and visualization of data from sensors including GOES-16, VIIRS, and ocean and weather satellites. The tools allow scientists and analysts to efficiently process large volumes of earth observation data and extract valuable information to support applications in weather forecasting, agriculture, infrastructure monitoring, and more.
Kitware uses HDF as a widely adopted data format for scientific computing and visualization across several domains. HDF supports climate modeling, geospatial data, medical imaging, and more. Kitware is looking to improve HDF support for streaming big data, cloud computing, and web applications to enable more advanced analytics and sharing of scientific data. Future work may include pure JavaScript implementations of HDF tools and optimizing performance for cloud storage.
This document describes web-based on-demand NDVI data services that provide global NDVI imagery summaries in GeoTIFF format. The services utilize MODIS satellite data from 2000-2012 processed using GDAL utilities and a C++ program. The services include NDVImax, NDVImin, and VCI metrics. The services have been tested and provide fast access. An ongoing project is developing the VWCS service using similar technologies to also be available on a web portal.
Repositories are systems mainly used to store and publish academic content. This presentation discusses why repository contents should be published as Linked (Open) Data and how repositories can be extended to do so.
Ross King, Project Director of SCAPE, gave a short presentation of the EU-funded project SCAPE, including descriptions of tools for planning and monitoring digital preservation, scalable computation and repositories, SCAPE Testbeds, and where to learn more.
The presentation was given at the workshop ‘Preservation at Scale’ http://bit.ly/17ppAln in connection with the iPres2013 conference in Lisbon, Portugal, in September 2013.
Rainer Schmidt, AIT Austrian Institute of Technology, presented Scalable Preservation Workflows from SCAPE at the five-day ‘Digital Preservation Advanced Practitioner Training’ event (http://bit.ly/1fYCvMO), hosted by DPC, in Glasgow on 15-19 July 2013.
The presentation gives an introduction to the SCAPE Platform, presents scenarios from SCAPE Testbeds, and describes how to create scalable workflows and execute them on the SCAPE Platform.
Case Study: Implementing Hadoop and Elastic MapReduce on Scale-out Object Storage - Cloudian
This document discusses implementing Hadoop and Elastic MapReduce on Cloudian's scale-out object storage platform. It describes Cloudian's hybrid cloud storage capabilities and how their approach reduces costs and provides faster analytics by analyzing log and event data directly on their storage platform without needing to transform the data for HDFS. Key benefits highlighted include no redundant storage, scaling analytics with storage capacity by adding nodes, and taking advantage of multi-core CPUs for MapReduce tasks.
SCAPE Presentation at the Elag2013 conference in Gent/Belgium - Sven Schlarb
Presentation of the European project SCAPE (www.scape-project.eu) at the Elag2013 conference in Gent, Belgium. The presentation includes details about use cases and implementation at the Austrian National Library.
Combining logs, metrics, and traces for unified observability - Elasticsearch
Learn how Elasticsearch efficiently combines data in a single store and how Kibana is used to analyze it. Also, see how recent developments help identify and resolve operational problems faster.
Hunk - Unlocking The Power of Big Data Breakout Session - Splunk
This document discusses Splunk's Hunk product and how it allows users to analyze data stored in Hadoop using Splunk. Hunk runs natively in Hadoop using MapReduce, supports mixed mode searching that allows previewing data, and auto-deploys Splunk components to Hadoop data nodes for real-time indexing. It also provides role-based security and supports connecting to data in NoSQL databases and SQL databases through Splunk's DB Connect product.
This document provides an overview of HPE solutions for challenges in AI and big data. It discusses HPE storage solutions including aggregated storage-in-compute using NVMe devices, tiered storage using flash, disk, and object storage, and zero watt storage to reduce power usage. It also covers the Scality object storage platform and WekaIO parallel file system for all-flash environments. The document aims to illustrate how HPE technologies can provide efficient, scalable storage for challenging AI and big data workloads.
SE training: StorageGRID Webscale technical overview - solarisyougood
The document provides an overview of StorageGRID Webscale, an object storage solution from NetApp. It discusses key concepts including how StorageGRID Webscale uses a distributed architecture with different node types to provide a global object namespace and scale to support billions of objects and petabytes of storage. The document also describes how StorageGRID Webscale leverages extensive metadata and policy-driven management to intelligently distribute and tier data across storage pools.
How Open Source Will Change How You Think about Storage - LGI Tech Summit - Scott Ryan
As software eats the world, open source always follows. Storage is being fundamentally disrupted by open source. This presentation covers software-defined storage and open source trends and how they affect traditional storage.
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar... - BigData_Europe
H2020 BigDataEurope is a flagship project of the European Union's Horizon 2020 framework programme for research and innovation. In this talk we present the Docker-based BigDataEurope platform, which integrates a variety of Big Data processing components such as Hive, Cassandra, Apache Flink and Spark. Particularly supporting the variety dimension of Big Data, it adds a semantic data processing layer, which makes it possible to ingest, map, transform and exploit semantically enriched data. We will present the innovative technical architecture as well as applications of the BigDataEurope platform for life sciences (OpenPhacts), mobility, food & agriculture, and industrial analytics (predictive maintenance). We demonstrate how societal value can be generated by Big Data analytics, e.g. making transportation networks more efficient or facilitating drug research.
Rachel Heery, Julie Allinson, Jim Downing, Christopher Gutteridge and Martin Morrey, UKOLN, University of Bath, will update attendees on a three-year UK program that is developing repository infrastructure aimed at increasing open access to scholarly material, while improving management of assets in higher education institutions. This effort is designed to ensure that the emerging network of JISC (Joint Information Services Committee) Digital Repositories is well populated with content. They will present their work towards defining a lightweight Common Repository Deposit Service Description.
Matthew Hale - Open Source at the King's Fund - Tracy Kent
The King's Fund Information and Library Service migrated from its proprietary library management system SirsiDynix Unicorn to the open source system Koha in January 2010. The migration was a natural choice given the library's philosophy of adopting open source solutions. It was completed with no downtime and involved splitting local fields and mapping data. The implementation of Koha has provided opportunities to further develop applications and embrace the open source community.
Object Storage promises many things - unlimited scalability, both in terms of capacity and file count, low-cost but highly redundant capacity, and excellent connectivity to legacy NAS. But despite these promises, object storage has not caught on in the enterprise like it has in the cloud. It seems that, for the enterprise, object storage just isn't a good fit. The problem is that most object storage systems' starting capacity is too large. And while connectivity to legacy NAS systems is available, seamless integration is not. Can object storage be sized so that it is a better fit for the enterprise?
OpenAIRE Open Innovation call: Next Generation Repositories - OpenAIRE
1) Current institutional repositories have issues with usability, interoperability, and acting primarily as silos for individual institutions' data.
2) The vision for next generation repositories is to position them as part of a globally networked infrastructure for scholarly communication, with the resources themselves, rather than the repositories, becoming the focus of services.
3) Key areas discussed for next generation repositories include improved resource discovery and content transfer using ResourceSync and Signposting, generating open usage metrics through a usage hub, and enabling annotation of content through web annotation protocols.
Archiving as a Service - A Model for the Provision of Shared Archiving Servic... - janaskhoj
The document proposes a four-layer model for providing cloud-based archiving services that enables long-term digital preservation. The model builds on the OAIS reference model and adds a preservation layer to capture preservation metadata and package digital objects early in their lifecycle. A case study on archiving challenges in the Japanese government demonstrates how the model could integrate systems and provide automated preservation functionality across agencies using a shared cloud platform and services.
DEVNET-1140 InterCloud MapReduce and Spark Workload Migration and Sharing: Fi... - Cisco DevNet
Data gravity is a reality when dealing with massive amounts of data in globally distributed systems. Processing this data requires distributed analytics processing across InterCloud. In this presentation we will share our real-world experience with storing, routing, and processing big data workloads on Cisco Cloud Services and Amazon Web Services clouds.
Application scenarios of the SCAPE project at the Austrian National Library - Sven Schlarb
An overview of the different application scenarios at the Austrian National Library related to Web Archiving and the Austrian Books Online project.
Similar to Computational Storage Services (WP7 ForgetIT 1st year review)
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency - ScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Generating privacy-protected synthetic data using Secludy and Milvus - Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application... - Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Essentials of Automations: Exploring Attributes & Automation Parameters - Safe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
AppSec PNW: Android and iOS Application Security with MobSF - Ajin Abraham
Mobile Security Framework - MobSF is a free and open source automated mobile application security testing environment designed to help security engineers, researchers, developers, and penetration testers to identify security vulnerabilities, malicious behaviours and privacy concerns in mobile applications using static and dynamic analysis. It supports all the popular mobile application binaries and source code formats built for Android and iOS devices. In addition to automated security assessment, it also offers an interactive testing environment to build and execute scenario based test/fuzz cases against the application.
This talk covers:
Using MobSF for static analysis of mobile applications.
Interactive dynamic security assessment of Android and iOS applications.
Solving Mobile app CTF challenges.
Reverse engineering and runtime analysis of Mobile malware.
How to shift left and integrate MobSF/mobsfscan SAST and DAST in your build pipeline.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors - DianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers - akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Monitoring and Managing Anomaly Detection on OpenShift.pdf - Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
HCL Notes and Domino license cost reduction in the world of DLAU - panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX model have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new type of licensing works and what benefits it brings you. Above all, you certainly want to stay within budget and save costs wherever possible. We understand that, and we want to help!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts in order to save money. There are also approaches that can lead to unnecessary expenses, for example when a person document is used instead of a mail-in database for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It gives you the tools and the know-how to keep track. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
Topics covered:
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Practical examples and best practices you can apply immediately
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... - Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
What is an RPA CoE? Session 1 – CoE Vision - DianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Introduction of Cybersecurity with OSS at Code Europe 2024 - Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
3. Simona Rabinovici-Cohen
IBM Research - Haifa
WP 7 Presentation
Computational Storage Services
ForgetIT 1st Review Meeting, April 29-30, 2014
Kaiserslautern, Germany
4. WP Objectives
• Increase the value and outcome of preserved information over time
–Provide additional incentive for preservation
–Increase return-on-investment (ROI)
• Transform the generic storage service into a richer service with potentially higher business value and automated preservation processes
Focus of Year 1
• Build a consolidated platform for objects and computational processes (storlets) that will be defined, triggered and executed close to the data (see the sketch below)
• Utilize the open source OpenStack Swift for cloud storage
Objectives of WP and Year 1 Focus
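To make "executed close to the data" concrete, here is a minimal sketch of how a client might trigger a storlet on a Swift object download. The X-Run-Storlet header and storlet name are assumptions borrowed from the conventions of the later OpenStack Storlets project, not the WP7 API described in this deck; the endpoint, credentials, container, and object names are placeholders.

    # Hypothetical sketch: run a storlet inside the object store on GET,
    # so only the computed result crosses the WAN.
    from swiftclient import client as swift_client

    conn = swift_client.Connection(
        authurl="http://swift.example.org/auth/v1.0",  # placeholder endpoint
        user="tenant:user",                            # placeholder credentials
        key="secret",
    )

    # The (assumed) X-Run-Storlet header asks the proxy/object node to execute
    # the named storlet over the object's bytes before returning them.
    headers, body = conn.get_object(
        "photos",
        "costa-rica-2013/img001.jpg",
        headers={"X-Run-Storlet": "thumbnail-1.0.jar"},
    )

The point of the design is that only the thumbnail, not the full-size image, travels back to the client.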
5. Role in Preserve-or-Forget Architecture
6. Leveraged PDS and Storlet Engine, adding:
Adapted Preservation Engine for ForgetIT
Rules mechanism
Storlets at interface proxy servers and local object servers
Multiple programming languages for storlets
New storlets:
image transformation storlet
fixity storlet (see the sketch below)
concept detection storlet
Searchable metadata contributions to the OpenStack community
Integration with the whole ForgetIT framework
Co-chair of the LTR group in SNIA to develop SIRF
Achievements in Year 1
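As an illustration of one of the new storlets listed above, here is a minimal sketch of a fixity storlet written against a hypothetical stream-in/stream-out storlet interface; the __call__ signature, parameter name, and metadata key are illustrative assumptions, not the Storlet Engine's actual API.

    import hashlib

    class FixityStorlet:
        """Minimal fixity-storlet sketch (hypothetical interface).

        Streams the object's bytes, computes a digest close to the data,
        and returns it as metadata so later audits can detect silent
        corruption without pulling the object out of the store.
        """

        def __call__(self, in_stream, out_stream, params):
            algorithm = params.get("algorithm", "sha256")  # assumed parameter
            digest = hashlib.new(algorithm)
            for chunk in iter(lambda: in_stream.read(65536), b""):
                digest.update(chunk)
                out_stream.write(chunk)  # pass the object through unchanged
            # The engine would attach the returned metadata to the object.
            return {"fixity-" + algorithm: digest.hexdigest()}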
7. Preservation DataStores (PDS)
PDS offloads some archiving functionality to:
Decrease probability of data loss
Simplify the applications
Provide improved performance and robustness
Supports automation of archiving processes
Provides computational storage via the Storlet Engine
PDS was also the storage infrastructure of the EU research projects CASPAR and ENSURE, with partners: European Space Agency, Maccabi HMO, Tessella, Philips and more
8. PDS in OAIS Functional Model
• OAIS is the ISO standard reference model for preservation (ISO 14721:2002)
• Provides fundamental ideas, concepts and a reference model for long-term archives
• Archival Information Package (AIP) - a logical structure for the preservation object that needs to be stored to enable future interpretation
• Content Data Object (CDO) - the raw data to be preserved
[Figure: PDS positioned within the OAIS functional model, managing the AIP]
10. PDS Data Model
Hierarchical data model: Tenant > Aggregation > Docket > Object (AIP)
Flexible organization of assets in collections with varied preservation policies (gold, silver, bronze)
Aggregations support dynamic and transparent configuration of data management (see the sketch below)
[Figure: example hierarchy - tenant "Peter Stainer" with aggregation "Private Photos" (gold) holding docket "Costa Rica 2013" and aggregation "Business Photos" (silver) holding docket "Edinburgh"; tenant "Spielwarenmessen" with aggregation "Press Releases" (gold) holding docket "Toy Conference 2014"; each object (AIP) carries metadata such as aggregation=Private, aggregation=Business, aggregation=Press]
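The hierarchy can be pictured as plain nested records. The sketch below recreates the "Peter Stainer" example from the figure in Python; the class names follow the slide's terminology, while the fields and construction code are illustrative only, not the PDS data model's actual representation.

    from dataclasses import dataclass, field

    @dataclass
    class Docket:
        name: str
        objects: list = field(default_factory=list)  # AIPs in this docket

    @dataclass
    class Aggregation:
        name: str
        policy: str  # preservation policy: "gold", "silver", or "bronze"
        dockets: list = field(default_factory=list)

    @dataclass
    class Tenant:
        name: str
        aggregations: list = field(default_factory=list)

    # The example hierarchy shown in the figure:
    peter = Tenant("Peter Stainer", aggregations=[
        Aggregation("Private Photos", "gold",
                    dockets=[Docket("Costa Rica 2013")]),
        Aggregation("Business Photos", "silver",
                    dockets=[Docket("Edinburgh")]),
    ])

Because the policy lives on the aggregation, moving a docket between aggregations transparently changes how its objects are preserved.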
11. The Need for Computational Object Storage
• “Data is the new oil.” (Clive Humby)
– In its raw form, oil has little value
– Once processed and refined, it helps power the world
• Data deluge of content depots and unstructured data
– Documents, medical images, photos, videos, etc.
– The fastest growing type of storage by volume
– Object storage is ideal for this type of data
• Object storage for content depots generally:
– Utilizes large bandwidth to serve big data over the WAN
– Uses server-based storage with underutilized CPUs
• Process and refine the data where it is stored
– Create computational object storage with storlets
12. Client Value for Using Storlets
Reduce bandwidth – reduce the number of bytes transferred over the WAN (e.g. analytics storlet)
Enhance security – reduce exposure of sensitive data (e.g. de-identification storlet)
Save costs – consolidate generic functions that can be used by many applications while saving infrastructure at the client side (e.g. curation storlet)
Support compliance – monitor and document the changes to the objects and improve provenance tracking (e.g. transformation storlet)
14. Rules Mechanism
Enables automatic conditional invocation of storlets
Explicit storlet activation overrides implicit activation
Rules kept as a per-tenant editable object, with specified access control
Configured by tenant, user, role, container, object, content_type
Wildcards (“*”) allowed in a rule (high flexibility)
The first rule that matches the input is activated – prioritized list of rules
Examples (see the sketch below):
De-Identification (per Role)
Transformation (per Content Type)
Fixity (per docket)
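A minimal sketch of how such a prioritized, wildcard-capable rule list could be evaluated. The match fields (tenant, user, role, container, object, content_type) come from the slide; the rule representation, storlet names, and matching code are illustrative assumptions, not the actual rules-object format.

    import fnmatch

    # Prioritized rule list: the first rule whose fields all match wins.
    # "*" in any field matches anything.
    RULES = [
        {"tenant": "*", "user": "*", "role": "external", "container": "*",
         "object": "*", "content_type": "*", "storlet": "deidentify-1.0"},
        {"tenant": "*", "user": "*", "role": "*", "container": "*",
         "object": "*", "content_type": "image/*", "storlet": "transform-1.0"},
        {"tenant": "*", "user": "*", "role": "*", "container": "docket-*",
         "object": "*", "content_type": "*", "storlet": "fixity-1.0"},
    ]

    FIELDS = ("tenant", "user", "role", "container", "object", "content_type")

    def select_storlet(request):
        """Return the storlet named by the first matching rule, or None."""
        for rule in RULES:
            if all(fnmatch.fnmatch(request.get(f, ""), rule[f]) for f in FIELDS):
                return rule["storlet"]
        return None  # no rule matched: no implicit activation

    # Example: an editor uploads a press photo; the content-type rule
    # matches before the docket fixity rule, so "transform-1.0" is chosen.
    print(select_storlet({"tenant": "spielwarenmessen", "user": "anna",
                          "role": "editor", "container": "docket-press",
                          "object": "img.jpg", "content_type": "image/jpeg"}))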
15. Storlets at proxy node and object node
[Figure: Swift cluster topology - proxy nodes (each running the proxy service plus a Storlet Engine) behind a virtual IP and L3 switches (10GB Ethernet); L2 rack switches (1GB Ethernet) connecting an account node (SSD), a container node (SSD), and object nodes (HDD), with each Swift object node running the object service plus a Storlet Engine]
17. • Papers
• S. Rabinovici-Cohen, E. Henis, J. Marberg, K. Nagin, “Storlet Engine: Performing Computations in Cloud Storage”, to be submitted
• S. Rabinovici-Cohen, R. Cummings, S. Fineberg, “Self-contained Information Retention Format for the …”, to be submitted
• Posters
• S. Rabinovici-Cohen (IBM), M. Baker (HP), R. Cummings (Antesignanus), S. Fineberg (HP), E. Henis (IBM), "Self-contained Information Retention Format (SIRF) in ForgetIT EU Project", 6th International Systems and Storage Conference (SYSTOR), 2013
• Other Dissemination Activities
• The Storage Networking Industry Association (SNIA) published in its March 2013 newsletter that the SNIA Long Term Retention group has formed a liaison with ForgetIT
• The tutorial "Combining SNIA Cloud, Tape and Container Format Technologies for the Long Term Retention of Big Data" has been given at several SNIA conferences
• Deliverables
• D7.1: Foundation of Computational Storage Services
• D7.2: Computational Storage Services First Release
Publications