Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI
The talk titled "Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI" was given by Prof. Amit Sheth at the ICMSE-MGI Digital Data Workshop, held at the Kno.e.sis Center on November 13-14, 2013. The talk emphasized two important issues that materials scientists encounter in publishing data: provenance and access control.

workshop page: http://wiki.knoesis.org/index.php/ICMSE-MGI_Digital_Data_Workshop

  • A picture with seven distinct stages that bring a material from discovery research to product deployment. The discovery stage needs the widest possible coverage of data. For example, a designer may want to find all the available property information about a material.
  • Data is spread across heterogeneous sources but is inaccessible to researchers and engineers: private lab information, a desktop, a notebook, data behind a firewall. To make it easy for everyone, can there be a single access point for searching all publicly available information about materials?
  • Each organization, such as a research lab or an industrial company, has three kinds of data: private, selectively shared, and public.
  • Semantics Mappings
  • Definition and example.
  • Data provenance is useful for many purposes. An example is on the next slide.
  • It is crucial to capture the various partially ordered processes and their detailed parameters (properties, compositions, predicted response, etc.) as provenance of the output material product. Missing or inaccurate information about any important factor in the processes may result in a different product, which may affect verification, reproducibility, testing, and trust.
  • Recap: RDF datasets can be intuitively represented as a graph, with a set of resources connected by edges. This graph may be replaced by another graph that describes an example from the materials science project.
  • To meet the customized needs of different organizations.
  • By capturing the access control primitive operators in processes: 1) A manager Y of a local component can grant access to individual users or to a group of users. The Public group is dedicated to the entire federated system; any resource granted to the Public group is available to everyone. 2) Meanwhile, we are also able to track any access rights in the system. One important scenario: a manager Y can ask why a suspicious user has access to an important resource.

Presentation Transcript

  • Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI Amit Sheth and the team Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing Wright State University, Dayton, OH-45435
  • Kno.e.sis’ MGI-related projects
      – Federated Semantic Services Platform for Material Sciences (funded via AFRL/Rome)
      – Materials Database Knowledge Discovery and Data Mining (funded from AFRL/RX)
    Faculty: A. Sheth (PI), K. Thirunarayan (coPI), R. Srinivasan (coPI), Clare Paul (expert)
    Students: K. Gunaratna, M. Panahiazar, S. Lalithsena, V. Nguyen, N. Bryant, A. Shiveley, N. Jaykumar
  • [Figure: a single access point over databases, personal desktops, and lab notebooks]
  • Public-Private Data Sharing: enhance publicly available datasets while retaining intellectual property data for businesses
      – Private data and metadata (e.g. ongoing experimental processes, intellectual property data)
      – Selectively shared data and metadata (e.g. with ongoing collaborators, licensed data)
      – Public data and metadata (released products, material specifications)
  • Federated Architecture
      – Federal Endpoint: 1. User Authentication 2. Federated Semantic Query Processor 3. Semantics Mappings
      – Components: Research Lab A, Industry Lab B, Organization C
      – Each component holds Private, Shared, and Public data, with its own AC Processor and Semantic Query Processor
  • Principles of a Federation
      – Each component controls access to its local data independently (local autonomy)
      – A query is decomposed into multiple subqueries, and each subquery is executed at one component
      – Results from the subqueries are combined by the federated query processor (which controls global access)
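
The three federation principles above can be illustrated with a small Python sketch. This is not the Kno.e.sis implementation: the `Component` class, the `federated_query` function, the `"public"` marker, and all `ex:` names and values are assumptions made up for illustration, and a "query" is reduced to a single predicate lookup.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Component:
    """One federation member; it alone decides what each user may read."""

    def __init__(self, name: str, readers: Dict[Triple, Set[str]]):
        self.name = name
        self.readers = readers  # triple -> users or groups allowed to read it

    def run_subquery(self, user: str, predicate: str) -> List[Triple]:
        # Local autonomy: the access check happens inside the component.
        return sorted(
            triple
            for triple, allowed in self.readers.items()
            if triple[1] == predicate and (user in allowed or "public" in allowed)
        )


def federated_query(components: List[Component], user: str, predicate: str) -> List[Triple]:
    """Send one subquery to each component and merge the answers."""
    merged: List[Triple] = []
    for component in components:
        merged.extend(component.run_subquery(user, predicate))
    return merged


# Hypothetical data: names and values are invented for this example.
lab_a = Component("Research Lab A", {
    ("ex:materialX", "ex:density", "4.4"): {"public"},
    ("ex:materialX", "ex:yieldStrength", "880"): {"alice"},
})
lab_b = Component("Industry Lab B", {
    ("ex:materialY", "ex:density", "2.7"): {"public"},
})

print(federated_query([lab_a, lab_b], "bob", "ex:density"))
print(federated_query([lab_a, lab_b], "bob", "ex:yieldStrength"))
```

In this sketch the federated processor never inspects permissions itself; it only merges what each component chooses to return.
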
  • Provenance Metadata
      – Explains the origins of an artifact, such as: How was it created? Who created it? When was it created?
      – Example, for a given material X: Which processes and properties were involved? What were the input and output values of those processes? Which research/engineering team performed the experiments?
  • Why Data Provenance?
      – Verification
      – Reproducibility
      – Trust
      – Testing
      – Quality
      – …
  • Product – Process – Product (Input, Processes, Output). Capturing provenance: Sufficient + Accurate => Reproduce the same output
  • A Unified Provenance Framework • Capturing domain-specific provenance – in addition to the W3C PROV ontology • Representing in standard RDF • Query engine for processing provenance queries • Operators for comparing artifacts’ provenance
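
As a rough illustration of provenance expressed as triples, and of an operator that compares the provenance of two artifacts, consider the Python sketch below. The properties `prov:wasGeneratedBy`, `prov:used`, and `prov:wasAssociatedWith` come from the W3C PROV ontology; everything under the `ex:` prefix, the function names, and the sample values are hypothetical, and the comparison shown (a set difference over process descriptions) is only one possible design, not the framework's actual operator.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical provenance graph: two samples made by two annealing runs.
graph: Set[Triple] = {
    ("ex:sampleA", "prov:wasGeneratedBy", "ex:annealA"),
    ("ex:annealA", "prov:used", "ex:ingot1"),
    ("ex:annealA", "ex:temperatureC", "700"),       # domain-specific term
    ("ex:annealA", "prov:wasAssociatedWith", "ex:teamLab1"),
    ("ex:sampleB", "prov:wasGeneratedBy", "ex:annealB"),
    ("ex:annealB", "prov:used", "ex:ingot1"),
    ("ex:annealB", "ex:temperatureC", "750"),
    ("ex:annealB", "prov:wasAssociatedWith", "ex:teamLab1"),
}


def process_description(artifact: str, triples: Set[Triple]) -> Set[Tuple[str, str]]:
    """Return (predicate, object) pairs describing the activities that generated the artifact."""
    activities = {o for s, p, o in triples if s == artifact and p == "prov:wasGeneratedBy"}
    return {(p, o) for s, p, o in triples if s in activities}


def compare_provenance(a: str, b: str, triples: Set[Triple]):
    """Return what is specific to a's process and what is specific to b's process."""
    desc_a = process_description(a, triples)
    desc_b = process_description(b, triples)
    return desc_a - desc_b, desc_b - desc_a


only_a, only_b = compare_provenance("ex:sampleA", "ex:sampleB", graph)
print(only_a, only_b)
```

Here the two samples share the same input and team, so the comparison isolates the annealing temperature as the one recorded difference between them.
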
  • Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?
  • Semantic Web Data: a triple is in the format (Subject, Predicate, Object). An RDF Dataset is a set of triples.
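
A minimal Python sketch of the triple model on this slide: a dataset is held as a set of (subject, predicate, object) tuples, and a pattern with wildcards selects matching triples. The `match` helper and the `ex:` names are assumptions for illustration, not part of any RDF library.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical dataset about one material and one process.
dataset: Set[Triple] = {
    ("ex:materialX", "ex:hasProperty", "ex:density"),
    ("ex:materialX", "ex:producedBy", "ex:processP"),
    ("ex:processP", "ex:performedBy", "ex:teamLab1"),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Set[Triple]:
    """Return the triples that agree with every non-None position of the pattern."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


print(match(dataset, s="ex:materialX"))
print(match(dataset, p="ex:performedBy"))
```
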
  • Linked Data Story So Far? Non-open data? Not there yet!
  • Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?
  • Different levels of granularity
      – Individual resources (example: a material product, a manufacturing process)
      – Individual triples (example: properties of a product or process)
      – Entire datasets
    Enable flexible selection of any data pieces to be shared at any time
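
The three levels of granularity can be sketched in Python as follows. The `SharingPolicy` class and its method names are hypothetical, and the sketch assumes that sharing a resource means sharing every triple whose subject is that resource; the talk does not specify this detail.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class SharingPolicy:
    """Records what has been shared with each group, at three granularities."""

    def __init__(self) -> None:
        self.dataset_groups: Set[str] = set()
        self.resource_groups: Dict[str, Set[str]] = {}
        self.triple_groups: Dict[Triple, Set[str]] = {}

    def share_dataset(self, group: str) -> None:
        self.dataset_groups.add(group)

    def share_resource(self, resource: str, group: str) -> None:
        self.resource_groups.setdefault(resource, set()).add(group)

    def share_triple(self, triple: Triple, group: str) -> None:
        self.triple_groups.setdefault(triple, set()).add(group)

    def visible(self, dataset: Set[Triple], group: str) -> Set[Triple]:
        """Return the triples of the dataset that the group may see."""
        if group in self.dataset_groups:
            return set(dataset)
        return {
            t for t in dataset
            if group in self.resource_groups.get(t[0], set())  # assumed: subject match
            or group in self.triple_groups.get(t, set())
        }


# Hypothetical data for illustration.
data: Set[Triple] = {
    ("ex:productP", "ex:density", "4.4"),
    ("ex:productP", "ex:recipe", "ex:secretRecipe"),
    ("ex:processQ", "ex:temperatureC", "700"),
}
policy = SharingPolicy()
policy.share_triple(("ex:productP", "ex:density", "4.4"), "Public")
policy.share_resource("ex:processQ", "Collaborators")
policy.share_dataset("Owners")

print(policy.visible(data, "Public"))
```
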
  • Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?
  • Federal Endpoint (User X, of either the Public group or Collaborators): 1. Query Rewriting 2. AC-embedded Query Execution. Local Component A (Manager Y of component A): AC Processes for Creating Resources, Granting Permissions, and Inferring Permissions
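
Granting permissions, inferring permissions through group membership, and answering "why does this user have access?" (the scenario raised in the speaker notes) can be sketched in Python as below. The `Grant` tuple, the `AccessLog` class, and all user, group, and resource names are hypothetical; the sketch assumes every user implicitly belongs to the Public group.

```python
from typing import Dict, List, NamedTuple, Set


class Grant(NamedTuple):
    grantor: str   # who granted the permission, e.g. a component manager
    grantee: str   # a user name or a group name
    resource: str


class AccessLog:
    """Keeps every grant so that access can be both checked and explained."""

    def __init__(self) -> None:
        self.members: Dict[str, Set[str]] = {}  # group -> users
        self.grants: List[Grant] = []

    def add_member(self, group: str, user: str) -> None:
        self.members.setdefault(group, set()).add(user)

    def grant(self, grantor: str, grantee: str, resource: str) -> None:
        self.grants.append(Grant(grantor, grantee, resource))

    def explain(self, user: str, resource: str) -> List[Grant]:
        """Return the grants that give the user access, directly or via a group."""
        groups = {g for g, users in self.members.items() if user in users}
        groups.add("Public")  # assumed: everyone is in the Public group
        return [
            g for g in self.grants
            if g.resource == resource and (g.grantee == user or g.grantee in groups)
        ]

    def can_read(self, user: str, resource: str) -> bool:
        return bool(self.explain(user, resource))


log = AccessLog()
log.add_member("Collaborators", "carol")
log.grant("managerY", "Collaborators", "ex:alloyRecipe")
log.grant("managerY", "Public", "ex:datasheet")

print(log.can_read("carol", "ex:alloyRecipe"))
print(log.explain("carol", "ex:alloyRecipe"))
```

Because grants are stored rather than flattened into a yes/no table, a manager can trace an unexpected permission back to the grant and the grantor that produced it.
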
  • Various Policies
      – Role-based Access Control (RBAC)
      – Mandatory Access Control (MAC)
      – Attribute-based Access Control (ABAC)
      – Discretionary Access Control (DAC)
    1. Which policy? Depends on the organization’s needs! 2. Our AC mechanism can be extended to support any of these policies
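
One way an access control processor could stay independent of the chosen policy is to accept the policy as a pluggable function. The Python sketch below shows that idea for simplified RBAC and ABAC checks only; `ACProcessor`, `rbac`, `abac`, and the request fields are assumptions for illustration and do not describe the mechanism presented in the talk.

```python
from typing import Callable, Dict, Set

Request = Dict[str, str]            # hypothetical request: attribute name -> value
Policy = Callable[[Request], bool]  # a policy decides whether a request is allowed


def rbac(allowed_roles: Set[str]) -> Policy:
    """Simplified role-based check: allow if the requester's role is permitted."""
    return lambda request: request.get("role") in allowed_roles


def abac(attribute: str, required_value: str) -> Policy:
    """Simplified attribute-based check: allow if one attribute has the required value."""
    return lambda request: request.get(attribute) == required_value


class ACProcessor:
    """Delegates every decision to whichever policy the organization plugged in."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy

    def allows(self, request: Request) -> bool:
        return self.policy(request)


lab_processor = ACProcessor(rbac({"manager", "researcher"}))
industry_processor = ACProcessor(abac("agreement", "signed"))

print(lab_processor.allows({"role": "researcher"}))
print(industry_processor.allows({"role": "researcher"}))
```
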
  • Summary: Semantic Federated Architecture enables us to
      – Enhance open data access
      – Protect confidential information
      – Improve communication between collaborating teams
      – Support the reproducibility of material products with confidence and trust
      – Utilize the power of Semantic Web standards and technologies to do so more easily, effectively, and flexibly
  • Thank you, and please visit us at http://knoesis.org/ Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA