This document provides an overview of policies for security and data sharing in the Physical Activity Location Measurement System (PALMS). PALMS aims to support data collection and analysis for exposure biology studies while being extensible, flexible, and HIPAA compliant. It discusses PALMS' logical architecture and policy composition, as well as the relationship between PALMS and the cancer Biomedical Informatics Grid (caBIG) framework. Key topics covered include identity management, access control policies, and integrating PALMS with caBIG services and tools for enforcing security policies in enterprise grids.
Overview of Policies for Security and Data Sharing in PALMS System
1. Overview of Policies for Security and Data Sharing
Ingolf Krüger
Barry Demchak
March 16, 2010
2. Roadmap
• PALMS (Physical Activity Location Measurement System)
• SOA Review
• PALMS Logical Architecture
• Policy and its composition
• Policy execution – relationship with caBIG
Feel free to ask questions!
3. PALMS Objectives
• Support data collection and analysis for exposure biology studies
– Data capture from multiple devices
– Multiple analyses and recombination of data
– Sharing of data between investigators and projects
– Support multiple visualizations (local and remote)
• Extensible and Flexible
– Scalable for large data flows
– Support large number of investigators and studies
– Customizable datasets, calculations, and visualizations
• HIPAA Compliant and Secure
10. Composing Workflow and Policy
• Define and implement Policy Concerns
– A class of policy decision embedded in a workflow
– Characterized by a contract for workflow and dataflow
– Supports reasoning regarding application correctness, completeness, and contradiction
– Instantiated as policies inserted by stakeholders at either design time or runtime
If user in [“PIs”, “RAs”, “Guests”]
Continue
Else
Reply “Failure”
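The pseudocode above can be sketched in Python as a guard wrapped around a workflow step. This is a minimal illustration, not the PALMS implementation: the group names come from the slide, while `group_policy`, `with_policy`, and `summarize` are hypothetical names introduced here.

```python
from typing import Callable, Set

# Group names come from the slide's pseudocode; everything else is hypothetical.
ALLOWED_GROUPS: Set[str] = {"PIs", "RAs", "Guests"}


def group_policy(user_groups: Set[str]) -> bool:
    """Return True when the user belongs to at least one allowed group."""
    return bool(user_groups & ALLOWED_GROUPS)


def with_policy(policy: Callable[[Set[str]], bool],
                step: Callable[[str], str]) -> Callable[[Set[str], str], str]:
    """Wrap a workflow step so the policy decision runs before it."""
    def guarded(user_groups: Set[str], payload: str) -> str:
        if not policy(user_groups):
            return "Failure"      # the "Else: Reply Failure" branch
        return step(payload)      # the "Continue" branch
    return guarded


# Hypothetical workflow step used only for illustration.
def summarize(payload: str) -> str:
    return f"summary of {payload}"


guarded_summarize = with_policy(group_policy, summarize)
print(guarded_summarize({"RAs"}, "dataset-1"))
print(guarded_summarize({"Visitors"}, "dataset-1"))
```

Keeping the policy as a separate callable means a stakeholder could swap it at design time or at runtime without touching the wrapped step.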
11. Groups and Roles
If user in [“PIs”, “RAs”, “Guests”]
Continue
Else
Reply “Failure”
• Internet2 Grouper
– Hierarchical group management
– Single point of control
– Permission-based administration
– Virtual organizations (VOs)
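Hierarchical group management can be illustrated with a small in-memory directory that resolves membership through nested groups. This sketch does not use the Grouper API; `GroupDirectory` and the group name "study-staff" are assumptions made for the example.

```python
from typing import Dict, Optional, Set


class GroupDirectory:
    """Hypothetical in-memory group hierarchy (not the Grouper API)."""

    def __init__(self) -> None:
        self._users: Dict[str, Set[str]] = {}
        self._subgroups: Dict[str, Set[str]] = {}

    def add_group(self, name: str) -> None:
        self._users.setdefault(name, set())
        self._subgroups.setdefault(name, set())

    def add_user(self, group: str, user: str) -> None:
        self.add_group(group)
        self._users[group].add(user)

    def add_subgroup(self, parent: str, child: str) -> None:
        self.add_group(parent)
        self.add_group(child)
        self._subgroups[parent].add(child)

    def is_member(self, group: str, user: str,
                  _seen: Optional[Set[str]] = None) -> bool:
        """True if the user is in the group or in any nested subgroup."""
        seen = _seen if _seen is not None else set()
        if group in seen or group not in self._users:
            return False          # unknown group, or already visited (cycle guard)
        seen.add(group)
        if user in self._users[group]:
            return True
        return any(self.is_member(child, user, seen)
                   for child in self._subgroups[group])


directory = GroupDirectory()
directory.add_subgroup("study-staff", "PIs")
directory.add_subgroup("study-staff", "RAs")
directory.add_user("RAs", "alice")
print(directory.is_member("study-staff", "alice"))
print(directory.is_member("PIs", "alice"))
```

Because membership is resolved in one place, a policy such as "user in PIs, RAs, or Guests" only needs to ask the directory, which matches the single point of control noted above.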
12. Identity
If user in [“PIs”, “RAs”, “Guests”]
Continue
Else
Reply “Failure”
• Establishing
– What I have (token)
– What I know (password)
– What I am (biometric)
• Referencing
– Trust relationships (certification authorities)
– X.509 certificate
– SAML assertion
– OpenID
[Figure: numbered exchange (steps 1 to 5) among Browser, Application, and ID Provider; message labels: User ID & Password, Confirm, Certificate]
13. caBIG
cancer Biomedical Informatics Grid
– Connects scientists & practitioners: shareable & interoperable infrastructure
– Develop standard rules & common language: easily share information
– Tools: collecting, analyzing, integrating, disseminating cancer information
– Cornerstones
    – Federation
    – Open development
    – Open access
    – Open source
– Workspaces
    – Clinical Trial Management
    – Integrative Cancer Research
    – Tissue Banks and Pathology
    – Vocabularies & Common Data Elements
    – Architecture
    – Strategic Planning
    – Data Sharing and Intellectual Capital
    – Training
14. caGrid & GAARDS
• Grid Authentication & Authorization with Reliably Distributed Services
– Services & Tools for enforcement of security policy in enterprise grid
– Developed on Globus Toolkit
– Provides
    – grid user management
    – identity federation
    – trust fabric provisioning and management
    – group/VO management
    – access control policy management and enforcement
    – credential delegation
    – web SSO
    – integration between security domains & grid security domain
16. Relationship to PALMS
• Pros
– Well supported
– caGrid Knowledge Center (Justin Permer/Ohio State Bioinformatics)
– Professionally managed
– Well-developed governance and development models
– Standards-based
– Security: X509 & SAML
– Ontologies: Thesaurus and Metathesaurus
– Sharing infrastructure
– Growing community
• Cons
– Key infrastructure out of our direct control
19. Composing Workflow and Policy
Scenario: Add Policy to Existing Workflow
(CNN | BBC) > story > if(authorized) > email(story, "x@ucsd.edu")
• Key issues
– What is policy to compose?
– Where to insert policy? ... capture all paths?
– How to compose multiple policies?
– How to guarantee integrity of workflow?
– Preview: We have to address these
• Current methodologies
– Requirement discovery and hand coding
– Policy-based design & Inversion of Control
– Aspect Oriented Programming
– UML sequence chart composition
• New methodology (preview)
– ORCA
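The scenario expression can be approximated in Python to show where the policy decision sits in the workflow. This is a rough sketch, not Orc semantics and not ORCA: `cnn`, `bbc`, `parallel`, and `run_workflow` are hypothetical stand-ins, the story text is placeholder data, and `parallel` visits sites sequentially where Orc would run them concurrently.

```python
from typing import Callable, Iterator


def cnn() -> Iterator[str]:
    yield "CNN headline"      # placeholder story, not real data


def bbc() -> Iterator[str]:
    yield "BBC headline"      # placeholder story, not real data


def parallel(*sites: Callable[[], Iterator[str]]) -> Iterator[str]:
    """Stand-in for `|`: publish every value from every site."""
    for site in sites:
        yield from site()


def run_workflow(authorized: Callable[[], bool],
                 send_email: Callable[[str, str], None]) -> None:
    """(CNN | BBC) > story > if(authorized) > email(story, address)."""
    for story in parallel(cnn, bbc):
        if authorized():                      # the inserted policy decision
            send_email(story, "x@ucsd.edu")


outbox = []
run_workflow(lambda: True, lambda story, to: outbox.append((story, to)))
print(outbox)
```

Even in this small form the key issues show up: the check has to sit on every path that reaches `send_email`, and a second policy would need a defined order relative to the first.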
27. Use Case Attributes
• ID
• Name
• Priority
• Complexity
• Release Number
• Last Revised
• Description
• Actors (Primary and Secondary)
• Stakeholders
• Pre-Conditions
• Constraints
• Post-Conditions
• Triggers
• Cross References
• Flow of Events
– Basic Flow
– Alternative Flows
– Exceptions
• Extensions
• Information Requirements
• Special Requirements
• Frequency of Use
• Assumptions
• Issues and Considerations
– Issues
– Considerations
• Process Flows
• Related Use Cases
[Figure: activity diagram for a data-upload use case. Steps: RA signs in; RA selects study; RA uploads .CSV and .GPX files; PALMS displays summary; RA confirms summary (accept: PALMS commits dataset; decline: PALMS abandons dataset). Error branches: "All files missing or invalid" and "Time range overlaps", each leading to "Display error".]
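The upload flow in the diagram can be sketched as a single function with the two error branches and the accept/decline decision. This is an illustration under stated assumptions: `UploadedFile`, `process_upload`, the integer timestamps, and the reading of "time range overlaps" as an overlap with data already in the study are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

TimeRange = Tuple[int, int]   # hypothetical: (start, end) in epoch seconds


@dataclass
class UploadedFile:
    name: str
    start: int
    end: int
    valid: bool = True        # stands in for real CSV/GPX parsing


def overlaps(a: TimeRange, b: TimeRange) -> bool:
    """True when two half-open time ranges intersect."""
    return a[0] < b[1] and b[0] < a[1]


def process_upload(files: List[UploadedFile],
                   existing: List[TimeRange],
                   confirm: Callable[[str], bool]) -> str:
    usable = [f for f in files
              if f.valid and f.name.lower().endswith((".csv", ".gpx"))]
    if not usable:
        return "error: all files missing or invalid"
    for f in usable:
        if any(overlaps((f.start, f.end), r) for r in existing):
            return "error: time range overlaps"
    summary = f"{len(usable)} file(s) ready to commit"
    # The RA accepts or declines the summary.
    return "committed" if confirm(summary) else "abandoned"


files = [UploadedFile("walk.gpx", 100, 200), UploadedFile("accel.csv", 100, 200)]
print(process_upload(files, existing=[(300, 400)], confirm=lambda s: True))
```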
33. Service Interactions (Calculation)
[Figure: sequence diagram. Participants: Web Browser, PALMS Study, Calculation Engine, Results Repository, Protocol Repository. Messages and replies: StartCalculation(study, protocolID, paramBlockID, resultName) / StartResult; AddResult(resultName, protocolID, paramBlock) / AddResult; GetProtocolParams(protocolID, paramBlockID) / ParamBlockResult. Activity labels: Start Calculation, Initiate Result, Get Param Block. Two alt fragments with guards "- study" and "+ study".]
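The calculation interaction can be sketched with in-memory stand-ins for the participants. The message names and parameters follow the diagram. The class names, the return values, the call order, and the reading of the "- study" / "+ study" guards as "study unknown" / "study known" are assumptions.

```python
from typing import Dict, Set, Tuple


class ProtocolRepository:
    def __init__(self, param_blocks: Dict[Tuple[str, str], dict]) -> None:
        self._param_blocks = param_blocks

    def get_protocol_params(self, protocol_id: str, param_block_id: str) -> dict:
        """GetProtocolParams: returns the ParamBlockResult."""
        return self._param_blocks[(protocol_id, param_block_id)]


class ResultsRepository:
    def __init__(self) -> None:
        self.results: Dict[str, dict] = {}

    def add_result(self, result_name: str, protocol_id: str,
                   param_block: dict) -> str:
        """AddResult: records a placeholder for the pending result."""
        self.results[result_name] = {"protocol": protocol_id,
                                     "params": param_block,
                                     "state": "initiated"}
        return result_name


class StudyService:
    def __init__(self, studies: Set[str], protocols: ProtocolRepository,
                 results: ResultsRepository) -> None:
        self._studies = studies
        self._protocols = protocols
        self._results = results

    def start_calculation(self, study: str, protocol_id: str,
                          param_block_id: str, result_name: str) -> dict:
        """StartCalculation: returns a StartResult-like dictionary."""
        if study not in self._studies:             # assumed "- study" branch
            return {"status": "failure", "reason": "unknown study"}
        param_block = self._protocols.get_protocol_params(protocol_id,
                                                          param_block_id)
        self._results.add_result(result_name, protocol_id, param_block)
        return {"status": "started", "result": result_name}


protocols = ProtocolRepository({("p1", "b1"): {"epoch_seconds": 60}})
results = ResultsRepository()
service = StudyService({"study-A"}, protocols, results)
print(service.start_calculation("study-A", "p1", "b1", "run-1"))
```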
34. The Road Forward
Component Interactions
[Figure: component diagram with one Client and two Servers; labeled technologies: Google Web Toolkit (GWT) and Mule Enterprise Service Bus]
35. PALMS Products
• Integration
– Mapping Engines
– Data Mining Engines
– Social Networks
– Disaster Management
• Alerts and Events
• Data Subscriptions
• Data Flow Analysis (provenance flow)
• Scalable and Configurable Calculations
• Collaboration
In the beginning: PIs have their studies, and their studies have their data, calculations, and visualizations
Insight: Studies can be managed centrally; calculations and visualizations can be reused; collaborations can occur with data, calculations, and visualizations
Click 1: Enter PALMS, an Internet-based facility for managing research
Click 2: The main features of PALMS: the study repository, calculation repository, and visualization repository
Click 3: Community uses PALMS to manage studies, provide calculations, and provide visualizations
Click 4: Policy -> HIPAA, Collaboration, etc
PALMS is a role-based system.
Data flows are associated with particular roles and particular targets
Click 1: A PI can define what data a study retains, what calculations can be made, and what visualizations can be made
Click 2: An RA can enter subject and observation information
Click 3: Once the information exists in the study, the RA can send it to a calculation engine, and then to a visualizer
Click 4: A guest cannot enter data, but can get calculations and visualizations
All data flows and requests are subject to policy (next slide)
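The role-based data flows described in these notes can be written down as a permission table. This sketch encodes only what the notes state: the `Role` and `Operation` names are hypothetical, and whether a PI may also perform RA operations is left out because the notes do not say.

```python
from enum import Enum, auto
from typing import Dict, Set


class Role(Enum):
    PI = auto()
    RA = auto()
    GUEST = auto()


class Operation(Enum):
    DEFINE_STUDY = auto()         # retained data, allowed calculations and visualizations
    ENTER_DATA = auto()           # subject and observation information
    RUN_CALCULATION = auto()
    GET_VISUALIZATION = auto()


# Only the permissions stated in the notes are listed here.
PERMISSIONS: Dict[Role, Set[Operation]] = {
    Role.PI: {Operation.DEFINE_STUDY},
    Role.RA: {Operation.ENTER_DATA, Operation.RUN_CALCULATION,
              Operation.GET_VISUALIZATION},
    Role.GUEST: {Operation.RUN_CALCULATION, Operation.GET_VISUALIZATION},
}


def is_allowed(role: Role, operation: Operation) -> bool:
    return operation in PERMISSIONS.get(role, set())


print(is_allowed(Role.GUEST, Operation.ENTER_DATA))
print(is_allowed(Role.GUEST, Operation.GET_VISUALIZATION))
```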
Policy can be defined at both the PALMS system level and at the study level
Click 1: What is a policy?
Click 2: Who defines policies? … it depends on the policy … (RAs can define policies that affect guests)
Click 3: An example: A guest wants to run a calculation and get a visualization
Click 4: Policies at both the PALMS and study levels apply to allow or reject the operation, or to constrain or shape it
Important points:
- Policy can be used for access control and HIPAA enforcement.
- Policy engines monitor all transactions.
- Policy engines not only enforce permissions; they also trigger audit logging.
- Engines similar to the policy engines can also perform encryption, anonymization, decimation, failure management, and so on
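The points above can be sketched as a small policy engine that evaluates system-level and study-level policies and records every decision. This is a minimal illustration, not the PALMS policy engine: `Request`, `PolicyEngine`, and the two example policies are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Request(NamedTuple):
    user: str
    role: str
    operation: str
    study: str


Policy = Callable[[Request], bool]


class PolicyEngine:
    def __init__(self, system_policies: List[Policy],
                 study_policies: Dict[str, List[Policy]]) -> None:
        self._system = list(system_policies)
        self._study = dict(study_policies)
        self.audit_log: List[Tuple[Request, bool]] = []

    def authorize(self, request: Request) -> bool:
        """Allow only if every system and study policy agrees; always log."""
        policies = self._system + self._study.get(request.study, [])
        allowed = all(policy(request) for policy in policies)
        self.audit_log.append((request, allowed))   # audit every transaction
        return allowed


# Hypothetical system-level policy: only known roles may act.
def known_role(request: Request) -> bool:
    return request.role in {"PI", "RA", "Guest"}


# Hypothetical study-level policy: guests may not enter data.
def guests_read_only(request: Request) -> bool:
    return not (request.role == "Guest" and request.operation == "enter_data")


engine = PolicyEngine([known_role], {"study-A": [guests_read_only]})
print(engine.authorize(Request("gina", "Guest", "run_calculation", "study-A")))
print(engine.authorize(Request("gina", "Guest", "enter_data", "study-A")))
print(len(engine.audit_log))
```

Logging inside `authorize` rather than in each caller is one way to make sure rejected requests are audited as well as accepted ones.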
Add an authentication policy to the CNN/BBC workflow … see the decision-making step in red
<<<<CLICK>>>>
In Orc, see the same decision being inserted
<<<<CLICK>>>>
<<Go over key issues>>
To solve the policy insertion problem, we have to solve these
<<<<CLICK>>>>
Show existing well-known solutions … not reactive to stakeholder policy insertion
<<<<CLICK>>>>
ORCA is part of the solution … it specifies WHAT and WHERE