Presentation of the paper:
Evaluating role mining algorithms. In Proceedings of the 14th ACM symposium on Access control models and technologies (SACMAT '09)
Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang, and Jorge Lobo. 2009.
ACM, New York, NY, USA, 95-104. DOI=10.1145/1542207.1542224 http://doi.acm.org/10.1145/1542207.1542224
1. Evaluating Role Mining
Algorithms
SACMAT’09, June 3 - 5, 2009, Stresa, Italy.
Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang
@ CERIAS Research Center Department of Computer Science, Purdue University
Jorge Lobo @ IBM T.J. Watson Research Center
Presentation by Onur Yılmaz - onur@onuryilmaz.me
3. Introduction
Aim of the study
Comprehensive study to compare role mining algorithms
What is presented?
Two new methods for generating datasets
Analysis of nine role mining algorithms
5. Overview
3 key points:
Output of a role mining algorithm
Criteria to compare outputs of algorithms
Input datasets
6. Overview
Output of Role Mining Algorithm
Existing algorithms based on their outputs:
Class 1: Outputting prioritized roles
Class 2: Outputting RBAC states
7. Overview
Output of Role Mining Algorithm
Class 1: Outputting prioritized roles
Prioritized list of candidate roles, each of which is a set of permissions
CompleteMiner and Fast-Miner
Candidate
role
generation
a set of candidate
roles from the userpermission assignment
data
Candidate
role
prioritization
8. Overview
Output of Role Mining Algorithm
Class 2: Outputting RBAC states
ρ = <User, Permission, UP >
RBAC state γ = <Roles, UserRoleAss,
RolePermissionAss,
RoleHierarchy, DirectUserPermissionAss>
9. Overview
Output of Role Mining Algorithm
Class 2: Outputting RBAC states
Minimize some cost measure while finding RBAC
output
Number of roles, number of user assignments
etc..
10. Overview
Output of Role Mining Algorithm
Class 2: Outputting RBAC states
Weighted Structural Complexity (WSC)
Sums up the number of relationships in an RBAC state, with
possibly different weights for each relationship.
11. Overview
Output of Role Mining Algorithm
Class 2: Outputting RBAC states
Weighted Structural Complexity (WSC)
Given a weight vector W = < wr,
wu, wp, wh, wd >
wsc(γ,W) = wr ∗ |R| + wu ∗ |UA| + wp ∗ |PA|+
wh ∗ |transitive_reduce(RH)| + wd∗ |DUPA|
12. Overview
Output of Role Mining Algorithm
Class 2: Outputting RBAC states
Weighted Structural Complexity (WSC)
Different weight vectors encode different mining objectives and
minimization goals
HierarchicalMiner takes both a configuration ρ and a weight vector and
aims at outputting an RBAC state with low WSC.
Graph optimization minimizes the number of edges
13. Overview
Output of Role Mining Algorithm
Class 1 vs Class 2 Algorithms
RBAC states are easy to compare
List of candidate roles can be more useful in practice
Administrator examines the role mining results and
determine whether to adopt some part of it.
In practice, whether role mining algorithms can suggest the
best candidate roles.
15. Overview
Metrics for Comparing Algorithms
Complexity of the RBAC state
Using WSC, how well each algorithm performs
under a variety of mining objectives
16. Overview
Metrics for Comparing Algorithms
Quality of Roles
For each weight vector W, evaluate the complexity of the optimal
RBAC state using only the top k roles.
Among the top k roles, how quickly do the mined roles cover the UP
relation?
Among the top k roles, how well do they «resemble» the original
roles?
19. Overview
Input Data Type
Generated Datasets
Random Data Generator
Tree-Based Data Generator
ERBAC Data Generator
20. Overview
Input Data Type
Random Data Generator
Permission
Role
User – Permission
Assignment
Roles
Number of Users,
Number of Roles,
Number of Permissions,
Users
Maximum Number of Roles for Users,
Maximum Number of Permissions for Role
21. Overview
Input Data Type
Tree-Based Data Generator
Assign
permissions to
nodes in the
tree
Assign users to
leaf nodes
Randomly
generate a tree
Number of Users,
Number of Permissions,
Height of Tree
Upper bound on number of children node,
Lower bound on number of children node
22. Overview
Input Data Type
ERBAC Data Generator
Permissions
Functional
Roles
Functional
Roles
Users
Number of Users,
Number of Business Roles,
Number of Functional Roles,
Number of Permissions
Business
Roles
Business
Roles
Maximum # of Business Roles,
Maximum # of Functional Roles,
Maximum # of Permissions
23. Role Mining Algorithms
Class 1
Class 2
CompleteMiner (CM)
ORCA
FastMiner (FM)
Graph Optimization (GO)
DynamicMiner (DM)
HP Role Minimization (HPr)
PairCount (PC)
HP Edge Minimization (HPe)
HierarchicalMiner (HM)
24. Role Mining Algorithms
CompleteMiner (CM)
Initial set of
roles
from user
permission sets
All possible
intersections
Prioritization
of roles
Candidate roles
Based on
number of
exact matches
Exponential Time
25. Role Mining Algorithms
FastMiner (FM)
Initial set of
roles
from user
permission sets
Only
intersection
between pairs of
initial roles
Prioritization of
roles
Candidate roles
O (n2m)
n: users, m: permissions
26. Role Mining Algorithms
DynamicMiner (DM)
CM and FM -> static prioritization (does not consider candidate
roles that been already chosen)
Initial set of
roles
from user
permission sets
All possible
intersections
Prioritization
of roles
Candidate roles
with the highest
priority first
O (n * |C| * min{n,m} )
C: Set of candidate roles
27. Role Mining Algorithms
PairCount (PC)
Newly proposed method
CM -> Prioritization based on exact numbers
In reality, multiple roles are assigned to a user
Pair Count: Pairs of users that share the only role, but no other
PC(P) = | { (ui, uj ) | ui = uj ∧ P(ui) ∩ P(uj) = P } |
O (n2m)
28. Role Mining Algorithms
PairCount (PC)
Initial set of
roles
from user
permission sets
All possible
intersections
Prioritization
of roles
Candidate roles
Based on Pair
Counts
O (n2m)
29. Role Mining Algorithms
ORCA
Hierarchical clustering on permissions
Set of clusters of
permissions
Find pairs of
clusters
The number
of users
having both
permissions is
the largest
Continue
until
One cluster
or
No user with
permissions in
two clusters
O (m2n)
30. Role Mining Algorithms
HP Role Minimization (HPr)
Minimal set of roles to cover the user-permission assignment
relation
Selecting the
next user with
the fewest
uncovered
permissions
Select a user u and finds a
pair <U(u), P(u)>
All user-permission assignments between
U(u) and P(u) are removed
This pair forms a
«role»
P(u): Permissions of user u
U(u): All users have all the permissions of u
O (nm)
31. Role Mining Algorithms
HP Edge Minimization (HPe)
Finding a RBAC state with minimal number of edges, called edge
concentration
Similar to Graph Optimization algorithm, except this does not create a
role hierarchy
HPr
Greedily
improve
objective
function
If two roles have
overlap in the
permission or
user sets ->
restructuring
Converge
O (k2m)
k : number of iterations
32. Role Mining Algorithms
HierarchicalMiner (HM)
Concept: < P, U > such that
U contains all the users that have all permissions in P,
P contains all the permissions that are shared by all users in U
Reduced
family of
concepts
Remove a role
if RBAC state
is improved
Removing a role:
- Redistribution of users
down the hierarchy
- Permissions up the
hierarchy
Heuristically
continue
Similar to Graph
Optimization but
uses concept
lattice.
33. Evaluation Results
For each dataset, each algorithm
Ranked according to their ability to optimize
evaluation criteria
1 to N
Two metrics mentioned before:
Comparing Complexity of the RBAC States
Comparing Prioritized Role Quality
35. Evaluation Results
Comparing Complexity of the RBAC States
Edge Concentration
HM has an advantage in this test because its roles are designed for a role-hierarchy
36. Evaluation Results
Comparing Complexity of the RBAC States
Allowed Noise at Direct Assignments
Dataset contains errors that should not be covered by roles.
37. Evaluation Results
Comparing Complexity of the RBAC States
Discovering Original Roles
Similarity of mined roles to original data
Used metric is average maximal Jaccard
HM: The top 40+ roles
are more or less the ones
generated
PC: Performed the worst,
generating roles farthest
from the original data
40. Analysis
Algorithms that minimize the number of roles often generate RBAC states
with a larger number of edges, resulting in increased complexity.
GO generates large role hierarchies when the number of users is greater
than the number of permissions.
DM is over-fitting some of the roles to cover users, and does not consider
the entire resulting RBAC state.
HM is computationally and memory intensive.
41. Conclusion
Aim of the study
Comprehensive study to compare role mining algorithms
What is presented?
Two new methods for generating datasets
Analysis of nine role mining algorithms
42. Future Work
Handling data with attribute information
In addition to the user-permission data, attribute
information may also be available.
Handling noisy data
In some scenarios, the input user-permission data
may contain noises.
43. Evaluating Role Mining
Algorithms
SACMAT’09, June 3 - 5, 2009, Stresa, Italy.
Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang
@ CERIAS Research Center Department of Computer Science, Purdue University
Jorge Lobo @ IBM T.J. Watson Research Center
Presentation by Onur Yılmaz - onur@onuryilmaz.me