50 data principles for loosely coupled identity management v1 0
 

In the field of Identity and Access Management (IAM), Data is more important than Technology. A poorly designed data model can cause an IAM initiative to fail even with massive investments in technology products. Yet Data usually receives only superficial treatment, and many practitioners seem unaware of the basic principles to follow when designing Identity-based systems.

This presentation is a succinct summarisation of 50 data-related principles that an organisation overlooks at its peril.

    Presentation Transcript

    • “50 Data Principles for Loosely-Coupled Identity Management”, Version 1.0, May 2013. © Ganesh Prasad. This work is licensed under Creative Commons Attribution-No Derivs 3.0 Australia: http://creativecommons.org/licenses/by-nd/3.0/au/
    • Table of Contents
      The Importance of the Data Model for Identity Management
      Domain, Entity, Identifier and Attribute
      Illustration
      Identity Federation
      Identity Life-cycle
      Merging of Entity Identities
      Splitting of Entity Identities
      The Life-cycle of External Identifiers
      Identity Establishment, Degree of Trust, Credentials and Authentication
      Illustration
      The Life-cycle of Credentials
      Single Sign-On – A Poster Child of Access Management
      Access Management – Intra-Domain (Data Stores and Data Flows)
      Access Management – Inter-Domain (Data Stores and Data Flows)
      Identity Management and Provisioning – Intra-Domain
      Illustration
      Identity Management and Provisioning – Inter-Domain
      Illustration
      Authorisation / Entitlements
      Propagation of Role Identity
      Domain Models and the Identity of Domain Entities
      Miscellaneous Aspects of Security
      Summary and Conclusions
      About the Author
    • The Importance of the Data Model for Identity Management
      Our previous publication (“Identity Management on a Shoestring”) introduced the concept of loose coupling in the domain of Identity and Access Management (IAM) and described in detail the architecture of a lightweight, low-cost, loosely-coupled corporate IAM system.
      With the advent of Cloud Computing, “Identity Management in the Cloud” has become a hot topic, and many standards have emerged to deal with the challenges of IAM at Web scale. The older XML-based trio of protocols – SAML for authentication, XACML for authorisation and SPML for provisioning – has achieved only mixed success, with only SAML having gained some traction in enterprise systems. But in the new world of the cloud, more lightweight alternatives are in demand. The “OAuth2 family” of protocol standards promises to deliver what the industry is asking for.
      The OAuth2 protocol itself, as a means of enabling constrained delegated authorisation of access, is an excellent base technology to protect other IAM APIs (protocols), specifically such APIs as OpenID Connect for Authentication and SCIM for User Provisioning [1].
      However, the OAuth2 family of protocols is by itself only a necessary and not a sufficient enabler. A well-designed protocol requires a flexible data model underpinning it, and both need to be based on sound principles.
      The most common mistake made by practitioners is the assumption that Identity Management is a technology problem. Regrettably, the transition from corporate IAM to cloud-based IAM has not changed that mindset. Solutions are almost always sought in the realm of technology products and protocols, and data models are relatively neglected. IAM projects therefore generally incur significant cost, but their record of success is not very encouraging. The irony is that identity-related problems can be effectively sorted out, and at far lower cost to boot, if the focus is placed where it belongs, i.e., on data. Identity management is primarily a data problem, not a technology problem.
      In this document, we will talk about important data principles and end by summarising the most common industry mistakes relating to data modelling.
      In a separate article (“Don't SCIM over Your Data Model”), we pointed out specific examples in the SCIM (System for Cross-domain Identity Management) provisioning standard where a data model based on certain principles could greatly improve the protocol. Here, we attempt to enunciate as many data-related principles as possible to support all the major aspects of Identity and Access Management. Some of these principles may appear trivial and obvious, while others may appear counter-intuitive or unnecessary. Nevertheless, we have found these principles useful in our own practice. They are unfortunately not commonly applied, which is why Identity and Access Management is often such a bugbear. We believe IAM (and not just Cross-domain Identity Management) will become much simpler if this model is followed.
      [1] SCIM used to stand for “Simple Cloud Identity Management” but now stands for “System for Cross-domain Identity Management”, which is more generic.
    • Domain, Entity, Identifier and Attribute
      Principle 1 (All Identity is contextual): An Entity's Identity is not an absolute concept but is only meaningful within a Domain or a Universe of Discourse, i.e., within some well-defined area of interest. “John” in a social Domain is a different Identity from “Mr. Smith” in a corporate Domain, and is different again from “dad” in a family Domain. All of them could be the same person, but trying to distil some fundamental Identity that exists outside all these Domains is generally neither feasible nor worthwhile. Domain-specific Identities are more natural to model, and it is much more practical to simply associate these Domain-specific Identities with one another when required. In other words, an Identity (for any Entity) that is independent of a Domain or that exists outside of all Domains is a chimera.
      Principle 2 (Uniqueness is the essence of Identity): Domains may contain many types of Entities, and within each type, there could be many individual Entities, or instances. These instances need to be distinguished from others of the same type, i.e., Entities (or more correctly, Entity instances) have unique Identity. If something does not have a unique Identity, then it is by definition not an Entity (e.g., “Value Objects” in Domain-Driven Design).
      Principle 3 (Identity is independent of Attributes): Entities have properties or Attributes with business meaning, and these Attributes are what make Entities “interesting” to the Domain in the first place. So it will seem counter-intuitive to assert as we do here that an Entity really has no inherent Attributes with business meaning! However, this principle aids flexibility by not imposing constraints. An analogy is the Standard Model of Particle Physics (recently validated through the confirmation of the Higgs boson's existence). A core aspect of the Standard Model is that particles do not inherently have mass. Some of them acquire mass through interaction with the Higgs field, which explains why other particles like photons have no mass. A loosely coupled model for Identity similarly decouples an Entity from all its meaningful Attributes. This is extremely useful in cases where anonymous Identity is to be tracked, analogous to persistent cookies in browsers.
      Principle 4 (Identifiers must have no business meaning): The only inherent Attribute of an Entity is a unique and meaning-free Identifier that serves only to distinguish it from other Entities in the Domain. It is critically important that the identifier be free of all business meaning, because it will otherwise have to change when the business context changes. E.g., using a person's email address as their identifier would require changing the identifier when the user changes ISPs. A meaningless string of characters is a much better identifier, because there would be no reason to change one for another. Identities are more “stable” when Identifiers are meaningless.
      Principle 5 (An Entity needs at least one “private” Identifier): An Entity's inherent Identifier attribute is private to the Domain in which the Entity is known and is never visible outside the Domain. There can be additional identifiers that are made public and explicitly designated as External Identifiers (described under Federated Identity). However, revealing a private identifier to third parties creates dependencies that can prevent necessary housekeeping operations such as splitting and merging identities (as will be discussed in a later section). The existence of external (public) Identifiers does not obviate the need for a private one.
      Principle 6 (Attribute model): All Attributes of an Entity (including all its External Identifiers) are associated with the Entity through its Domain-private Identifier. Attributes may be simple (a name-value pair), multi-valued/repeating (a name with multiple values of the same type) or composite (a collection of dissimilar Attributes). Combinations of these are also possible, to any arbitrary level of complexity.
      Principle 7 (Every attribute value must have its own unique Identifier): For non-repeating Attributes, the attribute's name is itself its unique Identifier within the namespace of the Entity. This is true for simple attributes (e.g., “date-of-birth”) as well as composite attributes (e.g., “address.post-code”). Similarly, each element of a multi-valued (repeating) Attribute must also be assigned a unique, meaning-free Identifier of its own (i.e., not a mere positional index). The reason for this Identifier will become clearer when we see how it can be used to support provisioning operations.
    • Illustration
      Example
      Domain: “Asia-Pacific Region”, “Small & Medium Businesses”, “Employees”, etc.
      Entity: “Customer”, “Product”, “Employee”, etc.
      Domain-private, meaning-free Identifier: “0d72c8ca-8b97-416c-9e2e-6f1695011e9e” (a UUID)
      External Identifier: “0800200c9a66” (some unique string), “ad28ce96-aea2-41f0-bdfd-c926682ad37f” (another UUID)
      Simple Attribute: Date of Birth
      Repeating Attribute: Email addresses – {UUID, address string}* (Note the per-value identifier, the UUID)
      Composite Attribute: Full name (Title, First name, Last name)
      Repeating, composite attribute: Addresses – {UUID, Address type (home/business), Street number, Street name, Street type, Suburb, Post code, State, Country}* (Note the per-value identifier, the UUID)
      [Figure: Domain X contains an Entity with a Domain-private Identifier. Attached to it are a simple Attribute (a name-value pair), a repeating Attribute (each value carrying its own Identifier), a composite Attribute, and External Identifiers 1 and 2, which are visible to Domains Y and Z.]
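The attribute model in Principles 4 to 7 and the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity` class, its field names, the `new_id` and `add_value` helpers, and the sample name and email values are all hypothetical and do not come from the presentation.

```python
import uuid
from dataclasses import dataclass, field


def new_id() -> str:
    """Return a meaning-free identifier (a random UUID)."""
    return str(uuid.uuid4())


@dataclass
class Entity:
    # Principles 4 and 5: a meaning-free Identifier, private to the Domain.
    private_id: str = field(default_factory=new_id)
    # Principle 6: External Identifiers hang off the private Identifier.
    external_ids: set = field(default_factory=set)
    # Non-repeating attributes: the attribute name is the identifier.
    simple: dict = field(default_factory=dict)
    # Repeating attributes: name -> {value_id: value}, so each value has
    # its own identifier instead of a positional index (Principle 7).
    repeating: dict = field(default_factory=dict)

    def add_value(self, name: str, value) -> str:
        """Add one value to a repeating attribute and return its identifier."""
        value_id = new_id()
        self.repeating.setdefault(name, {})[value_id] = value
        return value_id


# Sample data (hypothetical apart from the external identifier string).
customer = Entity()
customer.external_ids.add("0800200c9a66")
customer.simple["date-of-birth"] = "1970-01-01"
customer.simple["full-name"] = {"title": "Mr", "first": "John", "last": "Smith"}
home = customer.add_value("email", "john@example.org")
work = customer.add_value("email", "jsmith@example.com")

# Deleting one email does not disturb the identifier of the other,
# which would not hold if values were addressed by list position.
del customer.repeating["email"][home]
```

Because each repeating value is addressed by its own identifier, a later update or delete can target exactly one value without depending on the order in which values happen to be stored.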
    • Identity Federation
      Entities must often be referenced outside their Domain. Some of these references may be to associate Entity instances in two or more Domains because they refer to the same logical Entity.
      Principle 8 (Federated Domain model): No Domain is the “centre of the universe”. All Domains are co-equal, and no Domain can dictate any aspect of an Entity's identity within another Domain, although they may notify one another of Entity life-cycle events (i.e., creation, deletion/deactivation or a change in attributes). Domains cannot make assumptions about how other Domains will act on an Entity's life-cycle events.
      Principle 9 (Federated Identifier model): External Identifiers are special Attributes that also uniquely identify an Entity within a Domain (like “candidate keys” in relational database tables). When Entity instances are associated across two Domains, the Domains should share a common External Identifier for the Entity and map/associate this External Identifier to their respective Internal Identifiers. Neither Domain may see the other's Internal Identifier for the Entity.
      Example: Staff Member as Customer (same person)
      Domain: Customer. Domain-private Identifier: a224dadc-bb7b-4011-97ec-c8ba1336e65f. Attributes: Customer category (Platinum, Gold, Silver), etc. External Identifier: f6302ada-aece-4869-aeba-43a735f081a3
      Domain: Staff. Domain-private Identifier: c0e790e4-6466-43f3-8d52-253c3f75349b. Attributes: Employee number, etc. External Identifier: f6302ada-aece-4869-aeba-43a735f081a3
      [Figure: Entity X in Domain A (Domain-private Identifier x) and Entity Y in Domain B (Domain-private Identifier y) are associated through the shared External Identifier q. External Identifiers p and r are also shown.]
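The federated identifier model of Principle 9 can be illustrated with a small sketch. It assumes a hypothetical `Domain` class with `register` and `lookup` methods; the attribute values (`"Gold"`, `"E1001"`) are made up for the example. The shared External Identifier is the one used in the Staff Member as Customer example.

```python
import uuid

# The External Identifier shared by both Domains in the example above.
SHARED_EXTERNAL_ID = "f6302ada-aece-4869-aeba-43a735f081a3"


class Domain:
    """A Domain that maps External Identifiers to its own private ones."""

    def __init__(self, name):
        self.name = name
        self._external_to_private = {}  # External Identifier -> private Identifier
        self._attributes = {}           # private Identifier -> attributes

    def register(self, external_id, **attributes):
        """Create an Entity and associate the External Identifier with it."""
        private_id = str(uuid.uuid4())
        self._external_to_private[external_id] = private_id
        self._attributes[private_id] = attributes
        return private_id

    def lookup(self, external_id):
        """Serve a cross-Domain query without revealing the private Identifier."""
        private_id = self._external_to_private[external_id]
        return dict(self._attributes[private_id])


customer = Domain("Customer")
staff = Domain("Staff")
customer_private = customer.register(SHARED_EXTERNAL_ID, category="Gold")
staff_private = staff.register(SHARED_EXTERNAL_ID, employee_number="E1001")
```

Each Domain keeps its own private Identifier for the person, and the only value that crosses the boundary is the shared External Identifier.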
    • Identity Life-cycle
      Apart from the standard life-cycle events of creation, modification and deletion that apply to an Entity, there are events that are peculiar to its Identity. These require special rules and special processes.
      Principle 10 (Identifiers are forever): The life-cycle of an Entity's Identifiers (both Domain-private and External) is independent of the life-cycle of the Entity itself. Once created, an Identifier exists in perpetuity. An Entity may cease to exist, but the Identifier must not be recycled, i.e., used to refer to another Entity thereafter. Identifiers are created to be used for one Entity only, and form a useful reference to historical data even after the deletion of the Entity. (Of course, an Entity can be referred to by more than one Identifier, simultaneously as well as over the lifetime of the Entity.)
      There is an apparent exception to this principle. Within a Domain, it may sometimes be discovered that what were hitherto wrongly considered to be two or more Entities are actually the same (a false negative), or that what was wrongly considered a single Entity is actually more than one (a false positive). It will be necessary in such cases to merge or split Identities. When doing so, it will be necessary to change the association between External Identifiers and Domain-private Identifiers, which may seem like a violation of the above principle. However, in spirit, the Identifiers still refer to the same Entity. Only the previous model of the Entity was incorrect. With a suitable audit trail to document the merge or split, historical attributes and events can still be related back to an Entity.
      An important effect of having External Identifiers mapped to Domain-private Identifiers is that any splitting or merging of Entities that is necessitated by the requirement to fix errors (false negatives and false positives) within a Domain need not be exposed immediately to external Domains. It is possible for the mapping to hide these changes and have them communicated at a more convenient time, and only if necessary.
      Principle 11 (Merging of Identity): Merging two or more Entity Identities should be done by dropping all but one Domain-private Identifier. All the External Identifiers of the other Identities should now be associated with the sole remaining Domain-private Identifier. All the various attributes of those Identities should now be associated with this single Domain-private Identifier, merging and reconciling them as required. E.g., if two Entities referred to by the External Identifiers “J Smith” and “John Smith” are recognised to be one and the same, they must now have the same Domain-private Identifier, although the External Identifiers “J Smith” and “John Smith” may continue to exist, with both now associated with the same Domain-private Identifier. If John Smith has two valid phone numbers, and each of his previous Identities was associated with one, the merged Identity should have both. If the two earlier Identities had different addresses and only one turned out to be valid, then the discrepancy should be reconciled by dropping the invalid one and associating the remaining with the Domain-private identifier. Other Domains relying on these External Identifiers will not (and should not) be impacted by the clean-up within this Domain.
      Principle 12 (Splitting of Identity): Splitting Entity identities should be done by creating new Domain-private Identifiers for the Entities that have to be created out of the original one. Existing Attributes should be distributed among the new Entities by associating them with the appropriate Domain-private Identifiers. In some cases, Attributes may apply to more than one of the Entities, and will need to be duplicated. New External Identifiers may be created for the newly created Entities and exposed to other Domains if and when required. E.g., if it is discovered that an Entity with an External Identifier of “John Smith” actually refers to two different people, then only one of them should be allowed to carry this Identity going forward. The other should be given a new Identity (a new Domain-private Identifier) and can be gradually introduced to other Domains through a new External Identifier, (say) “John G Smith”. If the original Entity had a “Date of Birth” attribute with the value “1 Jan 1970”, and this was found to be valid for both the Entities after the split, then the attribute should be duplicated and applied to both through association with their respective Domain-private Identifiers.
    • Merging of Entity Identities
      Entities X and Y are discovered to be the same (a “false negative”). The two Entity identities should be merged.
      [Figure, before: Domain A holds Entity X (Domain-private Identifier x, attributes of X, External Identifier p) and Entity Y (Domain-private Identifier y, attributes of Y, External Identifier r). Both External Identifiers are visible to Domain B.]
      [Figure, after: Entity X (Domain-private Identifier x) holds the merged attributes of Entities X and Y, and both External Identifiers p and r are associated with it. Both remain visible to Domain B, which detects no change. Entity Y no longer exists; Domain-private Identifier y is no longer referenced and is never recycled (i.e., never used for another Entity).]
      Domain B can be notified of the merge at a convenient later time, or not at all.
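The merge described in Principle 11 and pictured above can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory `IdentityStore` in which every attribute holds a set of values; reconciling conflicting values (such as dropping an invalid address) is left out, and the phone numbers are invented.

```python
import uuid


class IdentityStore:
    """Hypothetical in-memory identity store for one Domain."""

    def __init__(self):
        self.external = {}    # External Identifier -> Domain-private Identifier
        self.attributes = {}  # Domain-private Identifier -> {name: set of values}
        self.retired = set()  # private Identifiers that must never be reused
        self.audit = []       # trail of merges, so history can be traced back

    def create(self, external_id, **attrs):
        private_id = str(uuid.uuid4())
        self.external[external_id] = private_id
        self.attributes[private_id] = {k: {v} for k, v in attrs.items()}
        return private_id

    def merge(self, keep, drop):
        """Fold the Identity `drop` into `keep` (fixing a false negative)."""
        # Re-point every External Identifier of the dropped Identity.
        for ext, private_id in list(self.external.items()):
            if private_id == drop:
                self.external[ext] = keep
        # Move the attributes across, keeping all distinct values.
        for name, values in self.attributes.pop(drop).items():
            self.attributes[keep].setdefault(name, set()).update(values)
        # The dropped Identifier is retired, never recycled (Principle 10).
        self.retired.add(drop)
        self.audit.append(("merge", drop, keep))


store = IdentityStore()
x = store.create("J Smith", phone="555-0100")
y = store.create("John Smith", phone="555-0199")
store.merge(keep=x, drop=y)
```

Both External Identifiers keep resolving after the merge, so another Domain that knows the person as “John Smith” sees no change.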
    • Splitting of Entity Identities
      “Entity X” is discovered to be representing two Entities (a “false positive”). The Entity identity should be split.
      [Figure, before: Domain A holds Entity X (Domain-private Identifier x, attributes of X, External Identifier p visible to Domain B).]
      [Figure, after: Entity X (Domain-private Identifier x) keeps the attributes rightfully belonging to X and External Identifier p, still visible to Domain B. A new Entity Y (Domain-private Identifier y) receives the attributes from X that rightfully belong to Y, together with a new External Identifier r that is made visible to Domain B when required.]
      Domain B can be notified of the split at a convenient later time, or not at all. In some cases, Domain B may need to be notified immediately.
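The split in Principle 12 can be sketched in the same spirit. The `split` function, its `move` and `duplicate` parameters, and the module-level dictionaries are hypothetical names chosen for this illustration; the phone and suburb values are invented, while the names and date of birth follow the example in the text.

```python
import uuid

external = {}    # External Identifier -> Domain-private Identifier
attributes = {}  # Domain-private Identifier -> {attribute name: value}
audit = []       # trail of splits, so history can be traced back


def split(original, move, duplicate, new_external_id):
    """Carve a new Entity out of `original` (fixing a false positive).

    move:      names of attributes that rightfully belong to the new Entity
    duplicate: names of attributes that are valid for both Entities
    """
    new_private_id = str(uuid.uuid4())
    attributes[new_private_id] = {}
    for name in move:
        attributes[new_private_id][name] = attributes[original].pop(name)
    for name in duplicate:
        attributes[new_private_id][name] = attributes[original][name]
    # The new Entity is introduced to other Domains via a new External Identifier.
    external[new_external_id] = new_private_id
    audit.append(("split", original, new_private_id))
    return new_private_id


x = str(uuid.uuid4())
external["John Smith"] = x
attributes[x] = {"date-of-birth": "1 Jan 1970", "phone": "555-0100", "suburb": "Newtown"}

y = split(x, move=["suburb"], duplicate=["date-of-birth"],
          new_external_id="John G Smith")
```

The original External Identifier stays with the original Entity, so only one of the two people carries the “John Smith” Identity going forward.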
    • The Life-cycle of External Identifiers
      Managing the life-cycle of Domain-private Identifiers is relatively easy, as we have shown, because there are few (if any) external dependencies on them. However, the same cannot be said about External Identifiers, which by definition, are visible externally and therefore have external dependencies.
      Example: Two independent, legacy applications with their own user stores need to be gradually decommissioned in favour of a new, integrated application. In other words, two legacy Domains are to be merged into a third one. (The legacy Domains may continue operating in parallel for a while, provisioning new users using independent legacy logic even as they are being migrated.) The treatment of Domain-private Identifiers is relatively straightforward as described in the previous section. If UUIDs are used in the legacy Domains, it may even be possible to migrate them unchanged without conflict since UUIDs share a universal namespace to start with. However, conflict is possible between External Identifiers because two namespaces are being collapsed into one. Hence specific strategies are required to manage these conflicts.
      Principle 13 (Maintain visibility of changes to External Identifiers): External parties that have built a dependency on External Identifiers in legacy Domains must understand that a migration is taking place and that a change in their Identifiers may be required. Setting expectations is key, because a seamless migration (i.e., no change to any External Identifier) is not realistic.
      Principle 14 (Determine a conflict resolution protocol): In the interest of least disruption, a protocol to resolve conflicts should be agreed before migration (e.g., “first come, first served” or “Domain 1 over Domain 2”). Only one claimant to an External Identifier that migrates to the new, merged namespace can be granted the right to use the Identifier. Other claimants will be required to choose another Identifier. The converse is not true. When false negatives are detected and merged into a single Entity, it can still retain all its External Identifiers.
      Principle 15 (Handle false conflicts): Although it is possible that External Identifiers from two or more legacy Domains will conflict in the merged namespace of the new Domain, this may simply be because they refer to the same logical Entity. This is a special case of a false negative, and will need to be detected as part of the migration process along with other false negatives. Their Domain-private Identifiers and attributes will need to be merged, of course, but their common External Identifier can be preserved without change because it relates to the same logical Entity.
      [Figure: Legacy Domain A (Entity X, Domain-private Identifier x, External Identifier p) and Legacy Domain B (Entity Y, Domain-private Identifier y, External Identifier p) are migrated into New Domain Q, where both Entities arrive claiming External Identifier p. Annotations: “Use the split technique in case of conflict” and “This conflict is harder to reconcile”.]
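Principles 14 and 15 can be sketched as a single migration pass. The sketch assumes a “Domain 1 over Domain 2” protocol expressed as list order, and a hypothetical `same_entity` matcher that decides whether two Domain-private Identifiers refer to the same logical Entity; how such matching is actually done is outside the scope of the text. All identifier values are invented.

```python
def migrate(legacy_domains, same_entity):
    """Collapse several External Identifier namespaces into one.

    legacy_domains: list of (domain name, {external_id: private_id}) pairs,
                    ordered by priority (earlier Domains win conflicts)
    same_entity:    function(private_a, private_b) -> bool (assumed matcher)

    Returns (merged mapping, identities to merge, rejected claimants).
    """
    merged, to_merge, rejected = {}, [], []
    for domain, mapping in legacy_domains:
        for external_id, private_id in mapping.items():
            if external_id not in merged:
                # No conflict: the Identifier migrates unchanged.
                merged[external_id] = private_id
            elif same_entity(merged[external_id], private_id):
                # False conflict (Principle 15): same logical Entity, so the
                # Identities are merged and the External Identifier is kept.
                to_merge.append((private_id, merged[external_id]))
            else:
                # Real conflict (Principle 14): this claimant must choose
                # another External Identifier.
                rejected.append((domain, external_id, private_id))
    return merged, to_merge, rejected


domain_a = ("A", {"jsmith": "a1", "mary": "a2"})
domain_b = ("B", {"jsmith": "b1", "mary": "b2", "lee": "b3"})


def same(p, q):
    # Assumed outcome of matching: a2 and b2 are the same person.
    return {p, q} == {"a2", "b2"}


merged, to_merge, rejected = migrate([domain_a, domain_b], same)
```

In this run, “mary” is a false conflict and is kept, while the Domain B claimant to “jsmith” is asked to pick a new External Identifier.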
    • Identity Establishment, Degree of Trust, Credentials and Authentication
      Principle 20 (Identity Establishment): Identity Establishment is a more general concept than Authentication. Identity Establishment within a Domain is simply the determination of an Entity's Domain-private Identifier through any means. Identity Establishment is the pre-requisite for most meaningful business functions involving the Entity, since all Attributes are associated with the Domain-private Identifier.
      Principle 21 (Trust model): The nature of the means through which an Entity's Identity is established determines the Degree of Trust associated with the Identity. A mere reference to an Entity's Domain-private Identifier is sufficient to establish the Identity of the Entity that is being discussed, but the Degree of Trust in the accuracy of that reference could vary based on how it came about.
      Principle 22 (Authentication): If a “sufficiently high” Degree of Trust is achieved in the establishment of an Entity's identity, “Authentication” can be said to have taken place. In other words, the assertion or claim that an Entity's identity is such-and-such is accepted. How high the Degree of Trust needs to be in order to be considered “sufficient” varies with the Domain, hence Authentication is a fuzzier concept than Identity Establishment. In practice, Authentication occurs either (1) when the Entity itself presents one of its External Identifiers along with one or more matching (security) tokens that other Entities are unlikely to possess or have knowledge of, or (2) when an External Identifier for the Entity is propagated from another (trusted) Domain in a tamper-proof way. In either case, the Domain-private Identifier for the Entity can then be determined with a high Degree of Trust through its association with the External Identifier that is presented. Authentication implies Identity Establishment but not the other way around.
      Principle 23 (The Degree of Trust is not static): The Degree of Trust can depend on a variety of factors, such as the strength of the security tokens presented to validate an Identifier, the channel through which the access takes place, the time elapsed since the Identity was established, etc. The strength of the security tokens is important since some sensitive functions may require stronger tokens to be presented when they are accessed. The channel is important because the Identities of users from a trusted network or using a trusted device are established with a higher Degree of Trust than those from untrusted networks or devices. (Of course, a conservative model is to treat all networks and devices as untrusted.) The time elapsed is important since the Degree of Trust can “decay” with time, requiring periodic renewal. This latter aspect clearly distinguishes Identity Establishment from Authentication, since the Identity continues to be valid, but a degradation in the Degree of Trust may require re-Authentication.
      Principle 24 (Credentials): The combination of an Identifier and a security token is referred to as a set of Credentials. A security token may be associated with a specific External Identifier or with the Domain-private Identifier for the Entity. The latter association can provide greater flexibility in certain cases, such as when allowing a customer to identify themselves in a variety of ways but requiring them to remember only one password.
      Principle 25 (Identity Establishment and Authentication can occur in any sequence): Identity Establishment and Authentication can occur in different domains and in any arbitrary sequence. The Degree of Trust in each domain can also vary accordingly. If Authentication is done in one Domain and the Identity is propagated securely to another Domain, then Identity Establishment occurs after Authentication. The Degree of Trust could be high in both Domains. If one Domain merely propagates an Identity to another, and the second Domain establishes the Identity and then performs Authentication, then the order is reversed. The Degree of Trust may only be high in the second Domain.
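Principles 22 and 23 describe a Degree of Trust that depends on token strength, channel and elapsed time, and that must clear a Domain-specific bar before Authentication is considered to have taken place. The presentation gives no formula, so the sketch below is purely an assumption: trust is scored on a 0 to 1 scale as the product of token strength, channel trust and an exponential decay with a made-up half-life of 15 minutes.

```python
from dataclasses import dataclass

HALF_LIFE_SECONDS = 900.0  # assumed: trust halves every 15 minutes


@dataclass
class Establishment:
    """Record of how and when an Identity was established (hypothetical)."""
    private_id: str
    token_strength: float  # 0..1, e.g. weak surrogate vs strong token
    channel_trust: float   # 0..1, e.g. untrusted vs trusted network
    established_at: float  # seconds on some monotonic clock


def degree_of_trust(e: Establishment, now: float) -> float:
    """Assumed scoring rule: strength x channel x time decay."""
    age = max(0.0, now - e.established_at)
    decay = 0.5 ** (age / HALF_LIFE_SECONDS)
    return e.token_strength * e.channel_trust * decay


def is_authenticated(e: Establishment, now: float, threshold: float) -> bool:
    """Authentication holds only while trust meets the Domain's threshold."""
    return degree_of_trust(e, now) >= threshold


session = Establishment("x", token_strength=0.9, channel_trust=1.0,
                        established_at=0.0)
fresh = is_authenticated(session, now=0.0, threshold=0.6)
stale = is_authenticated(session, now=900.0, threshold=0.6)
```

The Identity (`private_id`) stays established throughout; only the Degree of Trust decays, which is what would trigger re-Authentication for a sensitive function.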
    • Illustration
      (1) The Domain receives a set of Credentials in the form of an Entity's External Identifier p and a Security Token that is associated with that specific Identifier. If the token matches the value stored, then the Entity is deemed authenticated and its Identity is established to be the Domain-private Identifier x, with a high Degree of Trust.
      (2) and (3) The Domain receives a set of Credentials in the form of an Entity's External Identifier (r or s) and a Security Token that is associated with the Entity's Domain-private Identifier x. If the token matches the value stored, then the Entity is deemed authenticated and its Identity is established to be the Domain-private Identifier x, with a high Degree of Trust. What distinguishes these two cases from (1) is that all External Identifiers share a single Security Token, which makes security easier to manage, both for the Domain and for the Entity. E.g., a customer may present either a user name or a primary email address as their External Identifier, but only have to remember a single password. Changes to passwords then also apply uniformly to all Identifiers.
      (4) The Domain receives an Entity's External Identifier q from a trusted external Domain in a way that is guaranteed against tampering. Although no associated Security Token is received, the tamper-proof assertion of the External Identifier is itself considered a Security Token. There is a high Degree of Trust in the association of the External Identifier q to the Domain-private Identifier x, and the Entity is taken to be Authenticated.
      (5) The Domain receives an Entity's External Identifier t through a less trusted mechanism. The Entity's Identity can be established to be Domain-private Identifier x, but the Degree of Trust associated with this Identity is lower. It may be sufficient for certain purposes but not for others. Establishing a user's Identity by validating a weak surrogate such as their mobile number falls under this category.
      [Figure: “My” Domain holds an Entity with Domain-private Identifier x. Case 1: External Identifier p with an Identifier-specific Security Token (Identifier-Scoped Credentials). Cases 2 and 3: External Identifiers r and s with a Domain-wide Security Token for the Entity (Entity-Scoped Credentials). Case 4: External Identifier q asserted by a Trusted Domain, in which the Entity has Domain-private Identifier y. Case 5: External Identifier t.]
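The five cases in the illustration above can be read as one lookup routine. The following is only a minimal sketch under stated assumptions: the class `Domain`, its field names and the `establish` method are hypothetical names invented for illustration, and tokens are compared as plain strings purely to keep the example short (a real store would hold salted hashes).

```python
import hmac
from dataclasses import dataclass, field

HIGH, LOW = "high", "low"  # hypothetical labels for the Degree of Trust


@dataclass
class Domain:
    # External Identifier -> Domain-private Identifier
    id_map: dict = field(default_factory=dict)
    # Identifier-scoped tokens (case 1): External Identifier -> token
    identifier_tokens: dict = field(default_factory=dict)
    # Entity-scoped tokens (cases 2 and 3): Domain-private Identifier -> token
    entity_tokens: dict = field(default_factory=dict)

    def establish(self, external_id, token=None, trusted_assertion=False):
        """Return (domain_private_id, degree_of_trust), or None if rejected."""
        private_id = self.id_map.get(external_id)
        if private_id is None:
            return None
        if trusted_assertion:
            # Case 4: a tamper-proof assertion from a trusted Domain
            # is itself treated as the Security Token.
            return private_id, HIGH
        if token is None:
            # Case 5: Identity is established, but with a lower Degree of Trust.
            return private_id, LOW
        # Case 1 if this Identifier has its own token, else cases 2 and 3.
        expected = self.identifier_tokens.get(external_id)
        if expected is None:
            expected = self.entity_tokens.get(private_id)
        if expected is not None and hmac.compare_digest(
            expected.encode(), token.encode()
        ):
            return private_id, HIGH
        return None
```

In this sketch an Identifier-specific token, where one exists, takes precedence over the Entity-wide token. That is a design choice of the example, not something the illustration prescribes.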
    • The Life-cycle of Credentials
      The life-cycle of Credentials is related to that of External Identifiers. Domains that merge often have to merge Credentials as well, and this presents further problems.
      Principle 16 (Credentials are for the Entity, not the Identifier): In legacy Domains with a relatively simplistic Identity model, Credentials are usually associated directly with External Identifiers, and there is no separate internal Identifier. In the new Domain, Credentials should be associated with the Domain-private Identifier rather than with any External Identifier (See following section on Authentication). This can also prove friendlier to users when Domains are merged, because they can still use any of the External Identifiers they are used to, but only have to remember a single password, for example.
      Principle 17 (Limit the number of Credentials per Domain): A Domain should standardise upon one set of Credentials of the appropriate strength and discard all others. External parties with a dependency on these Credentials need to be notified of the consolidation and allowed to select the one they want to retain. In rare cases, more than one set of Credentials may need to be stored, such as to enforce “step-up” Authentication when accessing more sensitive functions.
      Ideally, Credentials must be stored in only one place, without duplication. Sometimes, due to legacy dependencies, it may not be possible to consolidate the storage of Credentials. In that case, their values should be harmonised across physical stores using MDM principles.
      Principle 18 (Employ Master Data Management to keep redundant Credentials consistent): Updates to credentials should follow MDM principles (Also discussed under Provisioning). One store should be designated the “source of truth”, i.e., the only one updated by business logic. All others should be treated as read-only replicas and robustly refreshed from the source of truth, either in real-time or using an eventual consistency model.
    • Principle 19 (Robustness): (Caution: This principle trades off security for user-friendliness.) The principle of robustness in communications is to “be conservative in what you produce and liberal in what you consume”. In similar fashion, IAM should be conservative while provisioning (i.e., provisions to ALL stores using MDM) and liberal while authenticating (accepts validation against ANY store).
      How robustness is ensured: When a user changes their password on the source of truth, the expected replication to secondary stores may fail. Normally, this would pose a problem if there are applications that authenticate against these secondary stores. However, Authentication will still succeed because all stores are checked and a successful validation against any one of them is taken as successful authentication. Hence replication failures will not impact users, and the system as a whole is robust.
      However, note that this approach also successfully authenticates the previous password. If the password had had to be changed because it was compromised, then this is clearly a security loophole. Hence this approach must be used with caution.
      [Figure: Credential Store 1 and Credential Store 2. 1. Initial Provisioning. 2a. Change of password. 2b. Replication (may fail). 3a. Authentication with new password (Step 1 - fails). 3b. Authentication with new password (Step 2 - succeeds).]
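Principle 19 and its caveat can be demonstrated with a small sketch. Everything named here (`CredentialStore`, `provision`, `authenticate`, the `available` flag that simulates a failed replication) is a hypothetical construct for illustration, not part of any product or protocol.

```python
import hashlib
import hmac
import os


class CredentialStore:
    """One physical credential store, keyed by Domain-private Identifier."""

    def __init__(self, name):
        self.name = name
        self.available = True  # set to False to simulate a failed replication
        self._records = {}     # private_id -> (salt, password hash)

    def set_password(self, private_id, password):
        if not self.available:
            raise ConnectionError(f"{self.name} unreachable")
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        self._records[private_id] = (salt, digest)

    def verify(self, private_id, password):
        record = self._records.get(private_id)
        if record is None:
            return False
        salt, digest = record
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        return hmac.compare_digest(candidate, digest)


def provision(stores, private_id, password):
    """Conservative: attempt the write on ALL stores; return the names that failed."""
    failed = []
    for store in stores:
        try:
            store.set_password(private_id, password)
        except ConnectionError:
            failed.append(store.name)
    return failed


def authenticate(stores, private_id, password):
    """Liberal: succeed if ANY store validates the password."""
    return any(store.verify(private_id, password) for store in stores)
```

If the second store is unreachable during a password change, `authenticate` still accepts the new password (via the first store) and also the old one (via the stale second store). This is exactly the loophole the principle warns about.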
    • Single Sign-On – A Poster Child of Access Management
      Principle 26 (Single Sign-On): Single Sign-On (SSO) is defined in different ways, but the most general one is that a user attempting to access a related group of systems (i.e., an SSO environment) is challenged for their credentials only once per “session”, whatever those credentials may be and however that session may be defined (See Single Sign-Out below). There could be multiple sets of credentials that could be used, but only one set is required for a given session. There could be multiple initial entry points into the SSO environment, and a different one could be used in each session, and each such entry point may require a different set of credentials, but once the challenge has been successfully met, no further challenge is made for the rest of the session. Hence this is the definition that covers all use cases including federation.
      Principle 27 (SSO Token model): The data model to support SSO is largely independent of the core Identity model and consists of a set of tokens. There is a single token for SSO itself, which ensures that a second challenge does not occur within the same session. There is a second set of tokens that are each specific to one particular application or resource that is being accessed. Authentication is sufficient to grant the SSO token, but Authorisation rules are consulted for each application or resource for which access is being attempted, and only successful authorisation will result in the grant of the application/resource access token. Both tokens are required before a user can access an application or resource in an SSO environment.
      Principle 28 (SSO and Identifiers): When SSO covers a single Domain, the Domain-private Identifier for an Entity can be freely exchanged inside tokens passed between systems as part of the SSO protocol. When SSO covers multiple Domains (“federated SSO”), appropriate mapping to External Identifiers must be performed and only External Identifiers may be communicated between Domains as part of the SSO protocol (e.g., within SAML tokens). Once Authentication completes, Identity is established, i.e., the Domain-private Identifier is known to the SSO system. Either this or a suitable External Identifier associated with it should be populated within the appropriate tokens and passed to participating (trusted) systems, depending on whether they are within the Domain or in external Domains.
      Principle 29 (SSO and Coarse-grained Authorisation): The SSO system should only perform coarse-grained authorisation checks (explained in detail later). Coarse-grained Role Identifiers may be populated within application access tokens passed to applications in addition to the user's Identifier (Domain-private or External). The application itself should perform fine-grained authorisation checks using its knowledge of the user's Identifier. This keeps the SSO system manageable in terms of logistical complexity, especially as the number of protected applications increases, because each application will have its own specialised roles and access rules.
      SSO has a number of side-effects, some of which may be seen as undesirable. If an application that is sensitive from a security viewpoint shares an SSO environment with other applications that are not sensitive, then there is a potential risk, because it is not possible to “log out” of the sensitive application if any of the other applications is being used. The SSO session is deemed to be active, and it will be possible to acquire unchallenged access to the sensitive application at any time, since a second challenge is never issued during an SSO session.
      Principle 30 (Ending SSO Sessions): As far as possible, sensitive and less sensitive applications should not be combined within the same SSO environment. Extra protection such as decay in the Degree of Trust requiring fresh authentication may be employed. This bends the definition of SSO a little by requiring a fresh challenge during a session but provides greater security for sensitive apps. An alternative is “Single Sign-Out”, under which logging out of a sensitive application ends the SSO session altogether and requires fresh authentication before access is granted again to any system. It may be argued that Single Sign-Out defeats the purpose of Single Sign-On.
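The token model of Principle 27, together with the Single Sign-Out option of Principle 30, might be sketched as follows. The class `SsoServer` and its method names are hypothetical, and the in-memory dictionaries merely stand in for the token and coarse-grained authorisation data stores.

```python
import secrets


class SsoServer:
    def __init__(self, coarse_roles, role_app_access):
        self.coarse_roles = coarse_roles        # private_id -> set of coarse-grained roles
        self.role_app_access = role_app_access  # role -> set of application names
        self.sso_tokens = {}                    # SSO token -> private_id
        self.app_tokens = {}                    # app token -> (SSO token, application)

    def login(self, private_id):
        """Called after successful Authentication: grants the single SSO token."""
        token = secrets.token_urlsafe(16)
        self.sso_tokens[token] = private_id
        return token

    def request_access(self, sso_token, app):
        """Coarse-grained authorisation; returns an application access token or None."""
        private_id = self.sso_tokens.get(sso_token)
        if private_id is None:
            return None  # no active session: the caller must be challenged
        roles = self.coarse_roles.get(private_id, set())
        if not any(app in self.role_app_access.get(role, set()) for role in roles):
            return None  # authenticated, but not authorised for this application
        token = secrets.token_urlsafe(16)
        self.app_tokens[token] = (sso_token, app)
        return token

    def may_access(self, sso_token, app_token, app):
        """Both tokens are required, and the app token must belong to this session."""
        return (
            sso_token in self.sso_tokens
            and self.app_tokens.get(app_token) == (sso_token, app)
        )

    def single_sign_out(self, sso_token):
        """Ends the whole session, invalidating every application access token in it."""
        self.sso_tokens.pop(sso_token, None)
        self.app_tokens = {
            t: v for t, v in self.app_tokens.items() if v[0] != sso_token
        }
```

Fine-grained checks are deliberately absent: per Principle 29 they belong in the application, which receives the Identifier and coarse-grained roles and applies its own rules.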
    • Access Management – Intra-Domain (Data Stores and Data Flows)
      [Figure: Within the Domain, a Client Application (Browser) presents Credentials (ID + Security Token(s)) to the SSO Server and receives an SSO Token plus Application Access Token(s). Application Access Tokens A and B, passed to Applications A and B, each carry the Domain-private ID, Coarse-grained Roles and User attributes. The SSO Server performs Authentication, Coarse-grained Authorisation and User attribute retrieval against the Logical IAM Data Stores; each Application performs Fine-grained Authorisation against its own store.
      Logical IAM Data Stores:
      - Authentication Data Store: Credentials, i.e., External ID + Security Token(s) OR Domain-private ID + Security Token(s)
      - ID mapping Data Store: Domain-private ID to External ID(s)
      - Coarse-grained Authorisation Data Store: Domain-private ID to Coarse-grained Roles; Roles to Application Access
      - User Attribute Data Store: Domain-private ID to other Entity Attributes
      - Token Data Store: temporary store for SSO Token and Application Access Tokens (Store / Retrieve)
      Fine-grained Authorisation Data Stores A and B: Domain-private ID to Fine-grained Roles; Roles to Function Access; Domain-private ID to Data Access (Rules)]
    • Access Management – Inter-Domain (Data Stores and Data Flows)
      [Figure: Domain X contains Application A, SSO Server X and IAM Data Store X; Domain Y contains Application P. The Client Application (Browser) exchanges Credentials and Tokens with SSO Server X. Application Access Token A includes the Domain-private Identifier as an attribute; Application Access Token P includes the External Identifier as an attribute. Because of the focus on data, not all flows of the federated SSO protocol are shown.]
    • Identity Management and Provisioning – Intra-Domain
      Data pertaining to an Entity may be distributed across multiple physical data stores in a geographically dispersed way within a logical Domain. Each physical data store may use a different identifier (unique within that data store alone) to refer to the Entity. Entity attributes may also be replicated (redundantly stored) across multiple data stores. More than one system may be responsible for creation of new Entity instances, for updating Entity attributes or for marking the deletion/deactivation of Entity instances.
      Principle 31 (Harmonise entity data into a single logical record): The Domain-private Identifier is a “candidate key” in every data store (although it will not replace the existing primary keys in any of them). It should be used to associate Entity data across multiple physical data stores within the Domain to effectively create a single logical Entity record. Physical consolidation of Entity data into a single data store is neither necessary nor (in general) cost-effective.
      Principle 32 (Advantages of the UUID format): Adopting the UUID format for the Domain-private Identifier allows multiple systems to independently create Entity instances if required. There is negligible danger of Identity conflict, since the probability of two random UUIDs having the same value is extremely low (about 1 in 5 × 10^36 for version 4 UUIDs, which carry 122 random bits). It is meaning-free, which is a major advantage in avoiding external dependencies. It allows different Domains to implement IAM at different speeds and yet interoperate or even merge when required. It also enables the adoption of an “eventual consistency” model for Entity Identity through the gradual merging of false positives, which is a more pragmatic approach than an aspiration for 100% consistency at any instant in time.
      Principle 33 (Intra-Domain Provisioning Event model): Provisioning events are of 5 major types: (1) Entity creation, (2) Entity deletion/deactivation, (3) Entity update (add new attribute), (4) Entity update (update attribute value) and (5) Entity update (delete attribute). In cases (4) and (5), if the attribute is multi-valued, then the specific instance of the value to be updated or deleted needs to be identified with its own unique identifier (UUID). In case (3), if the attribute is multi-valued, the provisioning request must receive a response with the unique identifier (say, a UUID) of the newly-created value. Case (4) could include arbitrary nesting of operations on nested attributes. It is of course the implementation's responsibility to handle this arbitrarily complex logic, but the data construct to support such nested operations is quite simple, i.e., a unique identifier for each value of a multi-valued attribute at any level in the hierarchy.
      Principle 34 (Attribute Identifier model): Ideally, each value of a multi-valued attribute must have an Internal Identifier and one or more External Identifiers that are loosely associated through mapping. This improves flexibility but unlike with Entities (where splitting and merging of Identities is a constant possibility), the use cases may not always justify the added complexity of having more than one Identifier per attribute. This is completely optional and dependent upon the use cases that are anticipated.
      Principle 35 (Employ Master Data Management for provisioning): Adopting Master Data Management (MDM) principles allows Entity data to be kept consistent within the Domain in spite of replication. Specifically, every attribute must have a designated source of truth, and all other instances of that data must be deemed to be read-only replicas. Only the source of truth can be updated by business logic. Replicas must only be refreshed from the source of truth. All Entity life-cycle events must be propagated from the appropriate source of truth, and all data stores holding replicas must listen on these events and update the replicas. This is a federated model, and the unique Domain-private Identifier reference is the required logical link across all data stores.
      Principle 36 (Reliability of Provisioning): A reliable event delivery mechanism is required to implement this model of MDM. Specifically, messages must be persistent (they must not be lost even if there is a system failure somewhere) and subscriptions must be durable (messages must be ultimately delivered even if a subscriber was temporarily off-line). Alternatively, an idempotence model of reliability may be used. Both models assume eventual consistency.
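The event model of Principle 33 and the idempotence option of Principle 36 can be sketched together. This is an illustrative simplification, not a prescribed format: the event field names, `new_event` and `apply_event` are hypothetical, every attribute is treated as multi-valued, and the source of truth is assumed to mint the UUID of each value and include it in the event it publishes to replicas.

```python
import uuid


def new_event(kind, entity_id, **fields):
    """Build a provisioning event with its own unique event identifier."""
    return {"event_id": str(uuid.uuid4()), "type": kind, "entity_id": entity_id, **fields}


def apply_event(store, event, seen):
    """Apply one event to a replica. Redelivery of the same event is a no-op."""
    if event["event_id"] in seen:
        return False
    entity = event["entity_id"]  # always the Domain-private Identifier
    kind = event["type"]
    if kind == "create":
        store.setdefault(entity, {})
    elif kind == "deactivate":
        store.pop(entity, None)
    elif kind in ("add_value", "update_value"):
        # Each value of an attribute is addressed by its own UUID.
        values = store.setdefault(entity, {}).setdefault(event["attribute"], {})
        values[event["value_id"]] = event["value"]
    elif kind == "delete_value":
        store.get(entity, {}).get(event["attribute"], {}).pop(event["value_id"], None)
    else:
        raise ValueError(f"unknown event type: {kind}")
    seen.add(event["event_id"])
    return True
```

Because each value has its own identifier, an update or delete never has to describe the value by its content. Deduplicating on the event identifier makes retries harmless; ordering of distinct events is outside the scope of this sketch.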
    • Illustration
      [Figure: A Domain with Data Stores A, B and C. Each store holds an Entity Instance with a Local Primary Key, the Domain-private Identifier (Candidate Key), Attributes for which the store is the source of truth, and Attributes that are replicas. Entity Lifecycle Events (referenced by Domain-private Identifier) flow between the stores: when a source of truth changes in one store, the corresponding replicas in the other stores are updated.]
    • Identity Management and Provisioning – Inter-Domain
      Inter-Domain, or Cross-Domain, provisioning is very topical because of the rapid adoption of cloud-based solutions to many problems. The SCIM protocol has been proposed to standardise provisioning operations between Enterprise Customers and Cloud Providers.
      Cross-Domain Identity Management and provisioning are similar to their counterparts within a Domain, with two important exceptions:
      1. Domain-private Identifiers are never exposed outside the Domain, as mentioned earlier. Special External Identifiers are shared by two or more Domains to refer to the same instance of an Entity, and they each map these External Identifiers to independent Domain-private Identifiers internally within their respective Domains.
      2. Within a Domain, there are multiple data stores that may hold Entity data, each with its own internal primary key for the Entity. The Domain-private Identifier acts as a candidate key in all the data stores within a Domain and is the logical “primary key” for the Entity within the Domain. This analogy does not hold between Domains. There is no “super-Domain” that plays for Domains the role that a Domain plays for the data stores within it. Hence there is no “super-Identifier” that can act as a “candidate key” within all Domains and also act as a Universal primary key for the Entity.
      The reason for the different treatment of data-stores-within-a-Domain and Domains themselves has to do with the very definition of Domain. Entity Identity only makes sense within a Domain, and the Domain imposes a uniform meaning to that Identity by virtue of the cohesiveness of the Domain, i.e., the common characteristics of the Domain that are shared by all systems and data stores within it. While it may be possible to build associations between specific Entity instances across two Domains, this is by no means universally applicable. E.g., some customers may be employees, and hence some Entities in the Customer domain may be associated with some Entities in the Staff domain, but not every Entity in these two Domains has a similar association.
      These characteristics lead to the following principle:
      Principle 37 (Inter-Domain Provisioning Event model): Provisioning events between Domains are more loosely coupled than provisioning events between data stores within a single Domain in that the Entity Identifiers used are always External Identifiers.
      (The SCIM protocol's provision of an “id” and an “external id” seems superfluous and even misleading, because all identifiers in messages must be external. Internal identifiers must never be exposed in messages.)
      The same 5 types of provisioning events apply as in the intra-Domain case. Instances of multi-valued attributes, no matter how deeply nested, require their own unique identifiers. (The SCIM protocol's treatment of multi-valued attributes is clumsy for the simple reason that it does not mandate this aspect of data design as a requirement.)
      As before, all events are advisory in nature and there is no notion of “control”. A Domain is free to react in whatever way it chooses to a provisioning event. The provisioning message must either be reliably delivered to the Domains that need to receive it, or it must be communicated in an idempotent way so it can be retried without adverse consequences. (SCIM assumes a synchronous request-response protocol that may prove too tightly-coupled and brittle. Time will tell.)
    • Illustration
      The Domain-private Identifier is “global” within a Domain and provides a uniform mechanism to identify an Entity across all data stores within the Domain.
      However, there is no analogous External Identifier that is “global” to all Domains. Domains need to agree on shared External Identifiers on a bilateral basis, because all Domains are co-equal and there is no global Domain.
      Forcing the use of a single global External Identifier across Domains can simplify Identifier mapping within any given Domain but introduces rigidity in the overall system due to tight coupling between Domains. There is a trade-off between the two aspects, and we recommend using multiple External Identifiers as illustrated here.
      [Figure: Three Domains, each holding the same Entity in two data stores.
      - Domain X: Domain-private Identifier “55”; Data Store A (Local Primary Key “1”) and Data Store B (Local Primary Key “4”); External Identifier “389”.
      - Domain Y: Domain-private Identifier “99”; Data Store P (Local Primary Key “6”) and Data Store Q (Local Primary Key “8”); External Identifiers “389” and “613”.
      - Domain Z: Domain-private Identifier “27”; Data Store S (Local Primary Key “9”) and Data Store T (Local Primary Key “5”); External Identifier “613”.
      Domains X and Y share External Identifier “389”; Domains Y and Z share External Identifier “613”.]
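The bilateral mapping in this illustration can be sketched with the same values the figure uses (“55”, “99”, “27”, “389”, “613”). The class `DomainIdMap` and its methods are hypothetical names chosen for the example.

```python
class DomainIdMap:
    """Per-Domain mapping between Domain-private and bilateral External Identifiers."""

    def __init__(self, name):
        self.name = name
        self._to_private = {}   # (partner Domain, External Identifier) -> private Identifier
        self._to_external = {}  # (partner Domain, private Identifier) -> External Identifier

    def link(self, partner, external_id, private_id):
        self._to_private[(partner, external_id)] = private_id
        self._to_external[(partner, private_id)] = external_id

    def outbound(self, partner, private_id):
        """Translate before a message leaves the Domain: never expose the private Identifier."""
        return self._to_external[(partner, private_id)]

    def inbound(self, partner, external_id):
        """Translate a received External Identifier to this Domain's private Identifier."""
        return self._to_private[(partner, external_id)]


x, y, z = DomainIdMap("X"), DomainIdMap("Y"), DomainIdMap("Z")
x.link("Y", "389", "55")  # X and Y agreed on "389"
y.link("X", "389", "99")
y.link("Z", "613", "99")  # Y and Z agreed on "613"
z.link("Y", "613", "27")
```

Domain Y holds two External Identifiers for the same Entity, one per partner. Replacing or re-issuing “613” would affect only the Y–Z relationship, which is the loose coupling the illustration argues for.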
    • Authorisation / Entitlements
      Principle 38 (Isolate Coarse-grained and Fine-grained Authorisation): Authorisation rules should be split into coarse-grained and fine-grained. Fine-grained authorisation rules should be applied close to where access takes place, such as a single application or data store. Coarse-grained rules can be applied at higher levels, such as the Domain. Standards such as XACML are seductive but misleading. Too much detail at a more coarse-grained level of control is logistically hard to manage and does not deliver sufficient benefit to justify the cost.
      Principle 39 (Roles): Roles are a mechanism to standardise access control privileges and apply them across Entities. Coarse-grained roles should be used at the Domain level and fine-grained roles at the individual application level.
      Principle 40 (Rules): Role-Based Access Control (RBAC) is useful for access to functions. It is generally not useful to control access to data (e.g., specific subsets of data within a data store). Rules, not roles, that are applied at an individual Entity level are more suitable to control access to data items. E.g., two personal bankers offering financial advice and other personalised services to high net worth customers may both have the same role entitlements, but rules specific to their individual identities will control which customers' data each can see.
      Principle 41 (On-behalf-of Access model): In situations requiring on-behalf-of access, the Identity of the Entity on whose behalf an activity is being performed should be treated as an attribute of the Entity that is physically performing the activity. E.g., if a customer service representative is performing a function on behalf of a customer, then the CSR's Identity is the one to be authenticated, and the customer's Identity merely established and verified as an authorised attribute of the CSR. This treatment aids the audit function as well, since it primarily records who performed an action, and secondarily on whose behalf the action was performed.
      Principle 42 (Role Propagation model): When one Domain propagates the Identity of an Entity to another, it is clear that the shared External Identifier will be used, and that is sufficient for the Establishment of Identity. However, for the receiving Domain to be able to perform authorisation effectively, it needs to be able to determine the Roles (coarse- or fine-grained) of the Entity and the allowed Functions for those Roles. This involves some trade-offs.
      If the first Domain does not pass in any Role information but only the Entity's External Identifier, it means the receiving Domain will have to maintain a mapping between the Entity's corresponding Domain-private Identifier and a set of Roles. In other words, the Entity will have to be provisioned within the receiving Domain as well.
      If the first Domain itself determines the Entity's Role(s) and passes them to the receiving Domain, then the two Domains must pre-agree the set of Roles, otherwise the received Role(s) may not be interpreted correctly. A system of Domain-private and External Identifiers for Roles can solve this problem, if Roles are treated as Entities in their own right.
      A third approach is for the receiving Domain to be agnostic to Entity Identity and allow function access purely on the basis of Role information passed in. This approach may not be universally applicable because of the common requirement to maintain audit logs with Entity identity.
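The personal-banker example of Principle 40 can be made concrete with a small sketch. The identifiers, function names and the three lookup tables are invented for illustration; a real application would hold them in its fine-grained authorisation data store.

```python
# Role -> functions that role may invoke (RBAC governs access to functions).
ROLE_FUNCTIONS = {"personal_banker": {"view_portfolio", "give_advice"}}

# Entity -> roles. Both bankers hold exactly the same role.
USER_ROLES = {"banker-1": {"personal_banker"}, "banker-2": {"personal_banker"}}

# Entity -> customers whose data that Entity may see (rules govern access to data).
DATA_RULES = {"banker-1": {"cust-A", "cust-B"}, "banker-2": {"cust-C"}}


def may_call(user_id, function):
    """Role check: is the function allowed for any of the user's roles?"""
    roles = USER_ROLES.get(user_id, set())
    return any(function in ROLE_FUNCTIONS.get(role, set()) for role in roles)


def may_see(user_id, customer_id):
    """Rule check: applied to the individual Entity, not to its role."""
    return customer_id in DATA_RULES.get(user_id, set())


def authorise(user_id, function, customer_id):
    return may_call(user_id, function) and may_see(user_id, customer_id)
```

Here `authorise("banker-1", "view_portfolio", "cust-C")` is refused even though the role permits the function, because the data rule for that individual does not cover the customer.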
    • Propagation of Role Identity
      Logical Model: An Authorisation check is performed by following the mappings of User-to-Role and Role-to-(allowed) Function to determine whether an attempted access is allowed or not.
      Federation use case: A user provisioned in Domain 1 needs to access a function in Domain 2. Domain 1 establishes the user's Identity. Domain 2 needs to determine whether the user is authorised to perform the function. The translation from user Identity to function access has to cross the Domain boundary, and it may use more than one model, as shown below.
      Implementation Model 1: Domain 1 only provides the External Identifier of a User Entity. Domain 2 needs to map the corresponding Internal Identifier to a Role and then determine the Function authorisation. For this model to work, a user provisioned into Domain 1 and requiring access to functions in Domain 2 must also be provisioned into Domain 2, either concurrently or in a Just-In-Time manner.
      Implementation Model 2: Domain 1 maintains a mapping of User identifiers to Roles, and provides the Role Identifier to Domain 2 (along with the optional External Identifier, which is required not for authorisation but for audit purposes). Domain 2 only needs to determine the Function authorisation using this Role Identifier. For this model to work, the two Domains must pre-agree a set of Role Identifiers.
      [Figure: Three panels, each showing the chain User, User-Role Mapping, Role, Role-Function Mapping, Function. In the logical model the whole chain is undivided. In Model 1 the User Identifier crosses from Domain 1 to Domain 2 (Domain 1's users need to be provisioned in Domain 2 in terms of their mapping to roles). In Model 2 the Role Identifier crosses the boundary (Domain 1 and Domain 2 need to agree on common role Identifiers).]
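The two implementation models differ only in which identifier crosses the Domain boundary, which the following sketch of the receiving side tries to show. `Assertion` and `ReceivingDomain` are hypothetical names, and the sample identifiers in the usage lines are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    """What Domain 1 sends to Domain 2 about the user."""
    external_id: Optional[str] = None  # needed for Model 1; kept for audit in Model 2
    role_id: Optional[str] = None      # present only in Model 2


class ReceivingDomain:
    def __init__(self, id_map, user_roles, role_functions):
        self.id_map = id_map                  # External Identifier -> Domain-private Identifier
        self.user_roles = user_roles          # Domain-private Identifier -> set of Role Identifiers
        self.role_functions = role_functions  # Role Identifier -> set of allowed functions

    def authorise(self, assertion, function):
        if assertion.role_id is not None:
            # Model 2: trust the pre-agreed Role Identifier passed in.
            roles = {assertion.role_id}
        else:
            # Model 1: the user must already be provisioned in this Domain.
            private_id = self.id_map.get(assertion.external_id)
            roles = self.user_roles.get(private_id, set())
        return any(function in self.role_functions.get(r, set()) for r in roles)


domain2 = ReceivingDomain(
    id_map={"389": "99"},
    user_roles={"99": {"teller"}},
    role_functions={"teller": {"open_account"}},
)
```

With Model 1, an External Identifier that Domain 2 has never provisioned is refused. With Model 2, the same user is admitted as long as the Role Identifier is one both Domains have agreed on.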
    • Domain Models and the Identity of Domain Entities
      Principle 46 (Identify all core Entities): Identity management is often understood to pertain to individuals, specifically “users”. However, Individuals are not the only Entities to be considered in a generic IAM system. A more generic Entity is “Party”. Individuals and Groups (such as Households and Organisations) are sub-types of Party.
      In general, all Parties have Domain-private Identifiers and External Identifiers, but only Individuals require Credentials, because Groups do not log into applications. Groups may however have relationships that Individuals do not have, such as when Organisations are billed for services that Individuals (employees) use. Groups also aggregate Individual Identities into more complex structures, such as organisations with multiple levels of business units and employees associated with each level.
      A full-fledged Data Model for a business will mesh the Identity Model with other important Entities in the business Domain, and it needs to be rich enough internally to be able to model associations such as the examples above.
      Other Entities that may be of interest in a given industry Domain are Products, Services, Locations, etc. Industry Domain Models exist at various levels of abstraction for the Banking, Insurance, Telecom and other industries. Similar data principles as enunciated in this document may hold for all these Entities, with appropriate variations.
      [Figure: Sample Identity Model for an Industry. Party, with sub-types Individual, Household and Organisation, each having its own attributes (Individual attributes, Household attributes, Organisational attributes). Party is identified by a UUID: a globally unique, meaning-free, externally-invisible identifier. Product/Service is identified by a Product/Service ID: an abstract ID decoupled from physical identifiers. Relationship labels shown: Belongs to (twice), Is sub-unit of, Is owned by, Is used by, Is managed by.]
    • Miscellaneous Aspects of Security
      Principle 47 (Appropriate use of Cryptography): Encryption of data for confidentiality is an orthogonal concern to Access Management/SSO (where it is mainly required) and is part of those protocols. Provisioning messages are also sensitive, but here it is more important to prevent spoofing of provisioning requests than to protect data from prying eyes. So cryptographic techniques are still required on the provisioning side, but more for authentication of event sources and message integrity than for encryption of traffic.
      Principle 48 (Non-trackability): Privacy concerns may dictate that when a domain shares an entity's external identifier with another domain, the unvarying nature of the identifier may itself create a vulnerability that third parties can exploit to track the behaviour and other profile information that may be associated with the entity. In such cases, it is not enough for an entity's external identifier to be meaningless. It must also keep changing at frequent intervals to prevent third parties from tracking the entity. The illustration on this slide shows one possible data scheme to implement this privacy requirement.
      Principle 49 (Audit): Audit of functions is greatly facilitated once Identifiers accompany every activity. On-behalf-of transactions must be logged with the Identifier of the logged-on user and the Identifier of the user on whose behalf the activity is being carried out. The latter is generally available as an attribute of the logged-on user. Audit logs need to be write-once and tamper-proof. When logs across physical systems need to be reconciled, External Identifiers in the log records (in addition to timestamps and transaction IDs) can help to establish a thread of activity.
      Principle 50 (Segregation of Duties): The Segregation of Duties principle means that provision needs to be made for two roles to perform each activity, one to request and the other to authorise. An Entity cannot authorise an activity requested by themselves. The data model presented in this paper can support this requirement through a combination of a Rule, two Roles and individual Identities for Entities.
      [Figure: In Domain 1, the Entity's Domain-private Identifier “ABC” is mapped to External Identifier “03736ef741d2”. A random “salt” is added to give a Salted Identifier (“03736ef741d24f”, “03736ef741d2c3”, “03736ef741d255”, etc.), which is encrypted to give an Untrackable Identifier (“b8d234018c4b”, “41b472a0bef6”, “9d30bd205534”, etc.). Domain 2 decrypts the Untrackable Identifier, discards the “salt” to recover External Identifier “03736ef741d2”, and maps it to its own Domain-private Identifier “XYZ”.]
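Principle 50's combination of a Rule, two Roles and individual Identities might look like the sketch below. The role names `requester` and `authoriser`, the functions and the sample identifiers are all hypothetical.

```python
def submit(requester_id, activity, user_roles):
    """Record a request. Only Entities holding the 'requester' Role may submit."""
    if "requester" not in user_roles.get(requester_id, set()):
        raise PermissionError(f"{requester_id} may not request activities")
    return {"activity": activity, "requested_by": requester_id}


def can_authorise(request, authoriser_id, user_roles):
    """Role check plus the Rule: nobody authorises their own request."""
    has_role = "authoriser" in user_roles.get(authoriser_id, set())
    is_other_entity = authoriser_id != request["requested_by"]
    return has_role and is_other_entity


# Both Entities hold both Roles, so the Roles alone cannot enforce segregation.
roles = {
    "id-1": {"requester", "authoriser"},
    "id-2": {"requester", "authoriser"},
    "id-3": {"requester"},
}
request = submit("id-1", "raise payment", roles)
```

The Roles decide who may request or authorise in general. The Rule compares individual Identities, which is why it needs the Domain-private Identifier recorded on the request rather than just a role name.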
    • Summary and Conclusions
      We said in the introduction that the most common mistake with IAM is in assuming that it is a technology problem. Given the subtlety of many of the data principles we have just seen, and the hugely negative implications of getting these wrong, it should be clear why IAM is not so much a technology problem as a data problem. If the emerging, modern IAM protocols did not exist, it would still be possible to “roll your own”, but if the underlying data model has serious flaws, no amount of technology thrown at the problem will fix matters.
      The data principles listed and described in this document have been arrived at through hard experience at more than one user organisation. Many of these appear obvious in hindsight, but it is astonishing that they are mainly observed in the breach.
      Let's summarise the most common industry mistakes relating to data modelling:
      1. Lack of clarity on the appropriate Entities in the Domain, and the subsequent use of inappropriate surrogates
      2. The use of identifiers with business meaning, creating external dependencies
      3. The use of sequence numbers instead of UUIDs for identifiers, creating real-time dependencies upon a single source
      4. The exposure of internal identifiers to parties outside the domain, creating external dependencies and preventing splitting/merging of Identities
      5. Recycling of identifiers, which breaks more things than we can describe here
      6. An assumption of “control” that spans domains, instead of a truly federated architecture, leading to needless political battles
      7. Lack of understanding of MDM principles, leading to poor provisioning practices and consequent inconsistencies in replicated data
      8. Lack of clarity on the Trust model, leading to conflict between the security function and the business/IT functions
      9. An excessively ambitious authorisation model, with the logistical complexity of fine-grained access control overwhelming the IAM function
      10. Lack of understanding of cross-Domain authorisation, leading to one-off, brittle solutions that break the overall model
      It is our belief that a meticulous application of the data principles in this document can lead to more flexible and robust data models, and these in turn will enable simple and powerful protocols for authentication, authorisation and provisioning. Together with audit, these functions form the core of what is called Identity and Access Management (IAM).
      We hope these principles will be useful not only to practitioners at end-user organisations but also to IAM protocol designers and product developers.
      About the Author
      Ganesh Prasad is an architect who has worked in the area of Identity and Access Management for over 5 years, covering the banking, insurance and telecom industries. His experiences in this field, combined with general lessons learnt over a decade of architecture practice and consulting, have prompted him to document his learnings for the benefit of other professionals. His previous writings in the area of IAM are “Identity Management on a Shoestring” (a how-to manual on designing a loosely-coupled corporate IAM solution, published through InfoQ - http://bit.ly/ZLv3i9) and “Don't SCIM over Your Data Model” (an InfoQ article detailing some suggestions to improve the SCIM protocol - http://bit.ly/10G8biT).