The central concept of identity management is usually a data record that contains a collection of data about a person. This concept has many names but the most common are: account, persona, user record, user identity. Accounts usually hold the information that describes the real-world person such as person's given name and family name. But probably the most important part is the technical information that relates to operation of an information system for which the account is created. This includes specification of home directory, wild wide variety of permission information such as group and role membership, system resource limits, etc. User accounts may be centralized and unified, distributed and unaligned or anywhere between these two extremes. But regardless of the architecture the aim of identity management is the management of accounts.
Identity Management Technologies
Identity management is not a single technology. In fact it is a wild mix of various technologies that both complement and overlap each other. There are at least three main technological branches in the identity management:
Although these technologies formally form a single field of identity management, their purpose and approach is significantly different. Any complex identity management solution will need at least a bit of each of them. These technologies are be explained below in much more details.
Accounts are stored in the databases called identity stores. The underlying technology of the database variesUnderlying technologies of these databases vary, ranging from flat text files through relational database to directory servers. Especially directory servers accessed by LDAP protocol are very popular because of their scalability. Identity store may be integrated with the application that is using it or it may be a shared stand-alone system.
Shared identity store is making user management easier. The An account needs to be created and managed on in one place only. Authentication happens in each application separately. But as the applications use the same credentials from the shared store, the user may use the same password for all the connected applications.
Identity management solution solutions based on shared identity stores are simple and quite cost-efficient. But capabilities of such solutions are considerably limited.
Identity stores are just that: storage of information. The protocols and APIs used to access such databases are primarily designed to be database interfaces. It means that they are excellent for storing, searching and retrieving data. While the data in account may contain entitlement information (permissions, groups, roleroles, etc.), identity stores are not well suited to evaluate them. I.e. identity store can provide information what permissions an account has but it cannot make a decision whether to allow or deny specific operation. Identity stores also do not contain data about user sessions. It means that identity stores do not know whether user is currently logged in or not. Some identity stores are frequently used for basic authentication and even authorization, especially LDAP-based directory systems. But the stores were not designed to do it and therefore provide only the very basic capabilities. Identity stores are databases, not authentication or authorization servers.
Meta-directory is a special directory system that is synchronizing synchronizes several "normal" directory systems. Meta-directory is copying copies data. Therefore the performance of the original synchronized directory systems is mostly unaffected.
Virtual directory is a proxy and a protocol converter. Virtual directory creates unified view from several "normal" directory servers. Unlike meta-directory the data are not copied. Virtual directory is fetching fetches the data from the original directory on each request and converting converts them to the the unified view.
Shared identity store is making user management easier but this is not not a complete solution and there are serious limitations to this approach. The heterogeneity of information systems in the common medium-to-large enterprise environment makes it nearly impossible to implement single directory system directly for the following reasons:
- Lack of a single, coherent source of information. There are usually several sources of information for a single user. For example, a HR system is authoritative for the existence of a user in the enterprise and for assignment of employee identifier. The Management Information System is responsible for determination of user's roles (e.g. in project-oriented organizational structure). The inventory management system is responsible for assigning telephone number to the user. The groupware system is authoritative source of the user's e-mail address and other electronic contact data. There are usually 2 to 20 systems that provide authoritative information for a single user.
- Need for a local user database. Some systems must store the copies of user records in local databases to operate efficiently. For example, large billing systems cannot work efficiently with external data (e.g. because relational database join is not possible). Legacy systems usually cannot access the external data at all (e.g. they do not support LDAP protocol).
- Stateful services. Some services need to keep state for each user to operate. For example file servers usually create home directories for users. While the automation of state creation can usually be done on-demand (e.g. at first user log-on), the modification and deletion of state is much more difficult.
- Inconsistent policies. The role Role names and access control attributes may not have the same meaning in all systems. Different systems usually have different authorization algorithms that are not mutually compatible. While this issue can be solved with per-application access control attributes, the maintenance of these attributes may not be trivial. A complex tool for transformation and maintenance of access control attributes (usually roles) may be needed.
Even using meta-directory or virtual directory mechanisms may not provide expected results, as such systems only provide the data and protocol transformation, but do not change the basic principle of directory services. A more complex approach is needed to manage the userusers' s records in heterogeneous systems, especially in large enterprise environment.
Single directory approach is feasible only in very simple environments of or almost entirely homogeneous environments. In all other cases there is a need to use also other identity management technologies.
Provisioning systems integrate many different identity stores. The goal of provisioning systems is to keep the identity stores as synchronized as possible (and practical). Priority of provisioning systems is to be non-intrusive. Provisioning systems do not try to change existing account data models in the applications. A provisioning system tries to adapt its own mechanisms to match the data model of each connected system. Provisioning systems are therefore quite complex and needs need to be customizable and programmable. Adaptation of the data models is frequently done by using complex rules and expressions.
Provisioning A provisioning system is just managing existing data stores. It is not doing any authentication or authorization on behalf of the application, ; that is a job of access management. Therefore a provisioning system is affecting affects the enforcement of security policies indirectly by manipulating data in other systems. Provisioning technologies are focused on application back-end without affecting the front-end in any significant way.
- Connectors are pieces of code running on the side of a provisioning system. In this aspect they are similar to the database drivers. Connectors expose application's objects (accounts, groups, ACLs, ...) to the provisioning system. Connectors use various kinds of remote protocols or APIs for that purpose. Connectors are non-intrusive and do not requite any installation on the application side.
- Agents run on the application side. Similarly to connectors, agents are exposing application's objects to the provisioning system. Agents are intrusive and require installation (and integration) on the application side. However, agents can use also local APIs and may be much more powerful than connectors.
Why do we even need provisioning systems? Isn't is easier to just deploy one single unified identity store such as LDAP server? Yes, it is easier. But it is possible only in a very simple situations (see Single Identity Store Myth above). Even if technical architecture favors the single identity store approach, there are still non-technical issues. E.g. the single identity store will not appear in a day. Its deployment and integration may take a long time. Provisioning system is needed in the meantime. Also the applications cannot adapt quickly. E.g. many applications support LDAP authentication out of the box. But LDAP authentication is sufficient only for very simple applications. Complex applications usually needs local data records: accounts. Even is if such accounts do not contain credentials (passwords) they still contain authorization data (roles, privileges, organization unit membership) that are not stored in the central identity store. Other application needs applications need local data records to be able to do database join e.g. for the purpose of reporting. And even if the application can theoretically work with single identity store it may take years to make it work practically. In such cases provisioning system can provide solution much faster and often also less costly.
The deployment of a provisioning system is usually quite a complex project. Not because the technology itself is complex but because the problem that the project solves is complex. If you need to deploy a provisioning system it s very likely that you have many identity stores to integrate, several sources of information that are only partially authoritative, messy business processes and so on. Even though provisioning deployments are complex, it is the best solution to these problems that we know of.
Provisioning system systems are always customized during deployment. This may be a small customization or a huge one, but some customization is always there. The most important difference among between provisioning products is the approach to customization approach. Some products are little more than a platform that requires to develop almost everything during deployment (e.g. OpenIDMv2). Such products are extremely flexible but may be relatively costly to deploy especially if your environment is quite the usual one. Other products implement many common IDM scenarios out of the box while still allowing some space for customization (e.g. midPoint). These products are generally easier and less costly do deploy but may not be suitable if your environment is miles away from the usual thing. There is no "one size fits all" when it comes to provisioning. It is important to select the right tool for the job.
- MidPoint: complex and efficient complete provisioning system.
- OpenIDM: flexible and programmable provisioning platform.
- Syncope: provisioning system build built on top of relational database.
Access Management deals with user authentication and (partially) authorization. The goal of access management is to unify the security mechanisms that take place when a user is accessing a specific system or functionality. Access management technologies are focused on application front-end as opposed to provisioning which is focused at back-end. Access management changes how is the user authenticated and authorized to access the applications.
Following The following figure illustrates theoretical case of the access management deployment. The access management systems acts as a mediator to all access to all applications. Access management system authenticates and authorizes the user based on the identity information stored in the identity repository. In case that all access checks pass the user is allowed to access the application.
- Almost all applications already implement authentication and authorization mechanisms so almost no simplification is applicable in a common case. It may even be quite difficult to replace existing mechanism with access management, which may significantly complicate the system.
- Access management system assumes an existence of a single unified identity repository. But that is seldom the case unless the repository is a result of other identity management mechanisms (e.g. provisioning).
- Access management system knows very little about internal structure of the applications. Therefore the ability to decide and enforce authorization is severely limited. E.g. the access management system can decide if a user can access application
Aor not. But it cannot decide if the user is authorized to modify property
fooin a record number
1234in that application. Therefore applications must very often implement their own additional authorization mechanisms. For that reason the applications must maintain their own user records (accounts) or must have back-end access to the identity repository.
- Access management can provide authorization services only if a user is accessing the system. While that is usually the case there is still significant number of cases when an operation has to be executed on behalf of the user while user is not online. E.g. scheduled tasks, asynchronous invocation, automated reaction to external messages, etc. Access management technology cannot handle these cases by its own.
Access management is an umbrella term for quite a wide range of mechanisms. Some access management systems deals deal only with authentication or single sign-on (SSO), others also deal with authorization, some are focused mostly on web applications, while other work only in enterprise environment where client machines can be strictly under control. Individual access management systems provide partial solutions to the identity management problems and they almost always must be combined with other identity management technologies.
Although Enterprise Single Sign-On (ESSO) has SSO in its name, it has very little in common with other SSO mechanisms. ESSO is in fact just a very simple agent that is silently waiting for login dialog to appear. When the dialog appears the ESSO agent fills in username and password and submits that dialog. User usually notices nothing of this and therefore he thinks that he was already logged in. This creates an illusion of SSO.
Access management systems usually do at least some kind of authorization. However the ability of the access management system to provide authorization services is significantly limited. The access management system knows who is accessing the system (subject) but has only very rough idea about the operation and knows almost nothing about the object that the operation affects. Yet, knowing all three parts of the authorization triple is essential requirements requirement for good authorization decisions. Therefore access management systems can do only a rough authorization decisions such as: "allow (any) access to system
Foo", "allow HTTP POST operations to URLs that match the pattern
If a finer authorization is required that that must usually be done by the application itself. The application knowns knows all the details about operation and object but does not have the details about subject. Therefore an application must be able to get the details of the authenticated user from the access management system.
A simple federation scenario is illustrated n in the following figure. Each color means a different organization connected over the Internet. One of the organizations is an Identity Provider. This organization maintains an identity repository that is used to authenticate users. After user logs in Identity Provider issues an assertion (federation token) to the user. This assertion is used as a proof of authentication. It may be presented to other organizations (Service Providers) that will let the user in. The assertion may be quite rich, e.g. containing also user attributes, privileges, roles, authorization decisions, etc. This may be used for further authorization by the Service Providers.
Access Management is a front-end identity integration technology. It means that it is changing the way how user interacts with the application. Even though the only aspect that changes is usually the way how user authenticates to the application there needs to be a change. The application or its supporting framework needs to have support access management solution. This least intrusive case is to configure the framework (e.g. Java EE application servers) or install a special-purpose agent. This way may be almost transparent for application when it comes to authentication and roughcoarse-grain authorization. But if the access management needs to be integrated more tightly then the modification of the application is almost unavoidable.
Access management technologies seldom care about local state of the application. Therefore if an application needs a local user record the application must create it on demand when a user is first accessing the system. This is a common case especially in federated deployments. So the user gets provisioned automatically, on demand. But the user never gets deprovisioned. If the user account is deleted in the Identity Provider repository it just disappears. Service Providers are not notified. The user data remains on Service Provider side indefinitely. This is a potential risk of data exposure especially if additional (local) authentication or credentials reset was configured. But in any case it waste wastes resources and may cause unnecessary cost e.g. if per-user service pricing model is used.
None of the identity management technologies provide a solution of its own. Perhaps except for the very smallest and simplest identity management projects any practical solution requires a combination of several technologies.
Provisioning system is synchronizing accounts and user records through the organization. It pulls data from various data sources such as HR and CRM systems and creates an unified view of such data. It is keeping keeps the databases of legacy and stateful systems in sync. However its most important responsibility is to maintain shared identity repository. The identity repository is used by applications that are able to do so. It is also used by the access management system as a coherent and authoritative user database.
Waterfall-like project will not work. It may succeed to meet project goals but it will not bring expected value to the customer.
- Provisioning system to improve password reset procedures, speed up user management, make audits more efficient and save a lot of operational costs. While the incentive to deploy provisioning system is mostly economic, there is a significant technical incentive as well: Provisioning system will clean up the data. The data will be ready for the next step.
- Identity repository that is shared among applications and provides an unified view for employee, customer, contractor, partner and similar records. Provisioning system may populate and (most importantly) maintain the data.
- Access management that is using the repository to provide SSO and somehow centralized rough authorization for the services that are able to integrate with it.
Some steps may be reordered or even skipped. E.g. the first "Basic provisioning" step may be skipped if there is already a solid repository (e.g. Active Directory instance populated by with employees). But that task will come back. The next iteration over provisioning will be more difficult (e.g. if customers and partners also need to be in the repository). The effort might be slightly optimized but it will not change much on the overall project shape.
- Integrate the systems to the shared identity repository if possible, especially if it is accessible by a standard protocol (e.g. LDAP). This is the most cost efficient way.
- Create a schema that suits most applications. Use a standard schema as a baseline (e.g. LDAP
inetOrgPerson). Almost all deployments will need some schema extension and customization.
- Do not over-complicate the schema. However hard you try the schema cannot suit all the applications at the same time. Special cases and complex data models are better handled by provisioning system.
- Resist the attractiveness of forcing a single directory server approach to all systems, e.g. by way of security requirements or purchasing rules. If the system supports the right protocol (e.g. LDAP) and the schema is a good fit for the system then connect the system to the central repository. If it does not then do not force it. Common shared identity schema is inevitably a compromise. Forcing everybody to use a compromise creates a mediocre solution that brings little value. And it creates severe problems. Individual applications will still need to maintain their own accounts to store data that are not in the shared schema (e.g. fine-grained privileges, locations, access zones, etc.) You cannot avoid problems caused by extra state if you just hide the state behind a paper screen of policies and purchasing requirements. The state will be still there. The applications will just try to synchronize their own accounts and the central repository to comply with the policy. But this in fact creates a special-purpose provisioning system for each application. Such synchronization is extremely difficult to do well and it is almost never done well. Therefore it creates constant operational problems and it is a nightmare to maintain. And at the end of the day it is all very expensive because the same work is done over and over again for each system. Use central provisioning system instead. It is cheaper, easier to maintain and it keeps the data under control.
- Prioritize your requirements. This is software and almost anything is possible. But some things take too long and cost too much. Use the technology to solve problems that it can solve efficiently. The technology evolves. What cannot be efficiently solved today may become an easy task next year.
- Think 80:20. 80% of result should be achieved by 20% of work. It may be possible to automate 100% of all identity management tasks but it might be extremely expensive. Aim at economic efficiency not technical excellence. 80% automation that costs just 20% may be exactly what you need.
- Do not over-complicate the roles (RBAC). Role Explosion is a common problem and definitely a dead end for an RBAC program.
- Consider to home-brew a solution. It makes no sense to reinvent existing products so use them instead. Take a couple of existing complementary products and try to customize them and put them together yourself. This DIY approach provides surprisingly good results especially if it is implemented in iterations. Nobody knows your requirements better than you do. The lack of IDM expertise may be somehow compensated by advice from product vendors, developers or partners.
- When choosing or implementing a solution make sure you consider a consistency mechanism. It may seem as an implementation detail from the distance, but this little detail may ruin entire project. And there are too many products, mechanisms, APIs and protocols that have it completely wrong.
Identity management is a mix of many technologies, a mix that can create both healing and deadly elixirs. The best approach seems to be pragmatic: to avoid big expectations, to improve the system where an improvement is needed and where it is economically and technically feasible. Identity management is no magic, it is just technology. And quite a young one.