Identity Provisioning is a technology originally designed for enterprise account synchronization and for the management of identity-related business processes. It originated in the enterprise environment, but it soon became obvious that it has much broader applicability. Today it is almost impossible to find an identity-related solution that is complete without some kind of identity provisioning component. However, a significant number of engineers still do not realize this and repeatedly try, and fail, to deploy partial identity management solutions. This approach was perhaps justifiable in the 2000s, when provisioning technologies were unreliable and expensive. Today there is no excuse for omitting provisioning technology from an identity management solution. The following paragraphs explain the reasons.

What is the Problem?

A huge number of identity management solutions are based on access management technologies. This includes Single Sign-On (SSO) solutions, identity federation, Internet "user-centric" identity solutions, etc. However, almost all access management technologies have a critical weakness: the transfer of user profiles. The very principle of these technologies and their current implementations mean that the transfer of user profiles is inherently bound to a user accessing the system (this is "access management" after all). The usual flow goes like this:

This is the theory. But it is also often used in practice and even recommended as a satisfactory and complete solution. The truth is that this approach is satisfactory only for very simple applications, those that do not need to do any serious processing of identity-related data. The vast majority of non-trivial applications, on the other hand, need to store at least some portions of the user profile locally. There are numerous reasons why applications need to do so. Perhaps the most pressing requirements are:

Practical experience clearly indicates that the vast majority of non-trivial applications need to store parts of the user profile locally. The applications just cannot work efficiently without it. Therefore there is usually one more step in the access management flow described above:

This approach apparently solves the problem. It looks like a satisfactory solution applicable to a very broad range of applications. It is sometimes even recommended as a best practice. However the reality has a couple of surprises.

A system that does this "on demand" management of user profiles may actually work very well after the initial deployment. This is the reason why the method is so often used and recommended. However, some time after the deployment things start to turn for the worse and critical issues begin to appear. The primary cause is the on-demand synchronization of user profiles, which is very far from perfect. It is actually closer to a "hack" than to a real solution.

Such synchronization happens only when a user accesses the system. Therefore it works very well for users that access the system for the first time. That explains why such a system works very well when deployed: all the users are accessing it for the first time. It also works acceptably well if users access the system frequently, and frequent access is what typically happens during testing and pilot deployment. Therefore the issues may go undetected for quite a long time and such a system may pass all the testing with flying colors.

The issues start to appear later in the system lifecycle, when there are longer delays between user visits. The synchronization becomes less frequent and the data quickly get out of date. Even such a benign data item as a user's full name changes surprisingly often. There are thousands of marriages every day and each of them is likely to result in a change of somebody's full name. And a wedding is still quite a rare event in the life of an individual. Changes of telephone number, e-mail address, postal address, locality and organizational affiliation are much more frequent. A user profile is much more dynamic than an average engineer would ever suspect. Locally stored user profile data become outdated surprisingly quickly. Under normal operational circumstances this is usually a matter of days or weeks rather than months or years.
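
The staleness described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `IdentityProvider` and `Application` classes, the `jsmith` account and the attribute names are hypothetical and do not correspond to any particular product or protocol. The only point is that the local copy is refreshed exclusively inside `on_user_access`.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityProvider:
    """Hypothetical authoritative source of user profiles."""
    profiles: dict = field(default_factory=dict)

    def login(self, username):
        # Access management hands over a copy of the profile at login time.
        return dict(self.profiles[username])


@dataclass
class Application:
    """Hypothetical application that keeps a local copy of each profile."""
    local_profiles: dict = field(default_factory=dict)

    def on_user_access(self, idp, username):
        # The local copy is refreshed only here, when the user shows up.
        self.local_profiles[username] = idp.login(username)


idp = IdentityProvider({"jsmith": {"full_name": "Jane Smith"}})
app = Application()

# First visit: the local copy is created and is fresh.
app.on_user_access(idp, "jsmith")

# The profile changes at the source while the user stays away.
idp.profiles["jsmith"]["full_name"] = "Jane Jones"

# The application still works with the old name.
stale_name = app.local_profiles["jsmith"]["full_name"]

# Only the next visit, whenever that happens, repairs the copy.
app.on_user_access(idp, "jsmith")
fresh_name = app.local_profiles["jsmith"]["full_name"]
```

Between the change at the source and the next visit, the application has no way to learn that its copy is wrong. If the account were deleted at the source instead, the user would never return and the local copy would never be updated or removed.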

When locally stored data become outdated, a number of severe problems appear:

These problems also apply to identity federation and Internet-based (user-centric) identity solutions, as these are based on a similar approach. Therefore similar reasoning also applies to solutions based on OpenID, SAML, WS-Federation, OAuth and similar technologies. Similar problems also affect the very popular APIs that form the "API economy", as these also frequently create copies of user profiles in the on-demand fashion.

Some deployers of access management solutions are obviously aware of these problems. They usually try to resolve them with "hacks" such as cleaning up accounts that have been inactive for several months. But such techniques are usually very unreliable, they cause severe discomfort for infrequent users, and they still do very little to address the security risks and privacy concerns. Except for a very few cases such "solutions" are absolutely inadequate.

Knowledge of these severe problems is relatively widespread among engineers who work on enterprise solutions. These concerns were documented as early as 2006 and the general knowledge was there even before that date. However, these facts are very little known in Internet-oriented environments. It almost looks like the Internet-oriented engineering community is ignoring the issues.

Identity Synchronization and Provisioning

It is impossible to avoid duplication of data in a network. Any transfer of data over the network actually creates a copy. Sometimes such a copy lives only for a couple of moments, but it is not unusual for a copy to exist for months or years. As we cannot avoid copying the data, we need to manage the copies. And that is what we call synchronization: a process to create, maintain and dispose of data copies in a manageable way. All identity provisioning systems are built around this concept in one way or another. Some are lightweight and elegant, others are heavyweight, cumbersome monsters. But all of them essentially synchronize identity-related data. It works like this:

This is a fundamentally different principle from the techniques used in access management technologies. It is not event-driven but data-driven. The trigger is not a user accessing the system but a change in the data. Therefore it is much more reliable, especially when combined with additional techniques such as reconciliation. Some advanced provisioning systems also have a self-healing capability and can correct the data immediately after a problem is discovered. The copies also do not need to be identical. The data can be transformed and adjusted as they flow between the systems.
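
As a rough illustration of the data-driven principle, the following Python sketch propagates every change from a source to a target as soon as the data change, transforms the data on the way, and uses a reconciliation pass to repair anything that drifted. All names in it (`SourceSystem`, `TargetSystem`, `transform`, `make_synchronizer`, `reconcile`) are hypothetical and chosen for this example only. It is not how any specific provisioning product is implemented.

```python
class SourceSystem:
    """Hypothetical authoritative system that announces every data change."""

    def __init__(self):
        self.accounts = {}
        self.listeners = []

    def write(self, username, profile):
        self.accounts[username] = dict(profile)
        for notify in self.listeners:
            notify(username, profile)

    def delete(self, username):
        self.accounts.pop(username, None)
        for notify in self.listeners:
            notify(username, None)


class TargetSystem:
    """Hypothetical application holding a transformed copy of the data."""

    def __init__(self):
        self.accounts = {}


def transform(profile):
    # Copies need not be identical: this target only wants a display name.
    return {"display_name": profile["full_name"].upper()}


def make_synchronizer(target):
    def on_change(username, profile):
        # Triggered by a change in the data, not by a user login.
        if profile is None:
            target.accounts.pop(username, None)
        else:
            target.accounts[username] = transform(profile)
    return on_change


def reconcile(source, target):
    """Compare both sides in full and repair whatever drifted."""
    repaired = []
    for username, profile in source.accounts.items():
        expected = transform(profile)
        if target.accounts.get(username) != expected:
            target.accounts[username] = expected
            repaired.append(username)
    for username in list(target.accounts):
        if username not in source.accounts:
            del target.accounts[username]
            repaired.append(username)
    return repaired


source, target = SourceSystem(), TargetSystem()
source.listeners.append(make_synchronizer(target))

source.write("jsmith", {"full_name": "Jane Smith"})
source.write("jsmith", {"full_name": "Jane Jones"})  # no login involved
synced = target.accounts["jsmith"]["display_name"]

# Simulate a missed change and an orphaned account, then reconcile.
source.accounts["bwhite"] = {"full_name": "Bob White"}
target.accounts["ghost"] = {"display_name": "GHOST"}
repaired = sorted(reconcile(source, target))
```

The change listener covers the normal case, while `reconcile` is the safety net for changes that were missed, for example because a system was offline. In this sketch the orphaned `ghost` account is removed from the target even though nobody ever logged in with it.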

Identity synchronization brings substantial advantages:

Provisioning technologies are a great success in some environments such as the enterprise, eGovernment and inside cloud solutions. But they cannot go into the Internet alone. Both the access part and the provisioning part are essential for an Internet-based identity solution. However, while it is quite well accepted that such a solution should have an access part, the provisioning part is almost always neglected. But it is essential. The solution just cannot be complete without it.

In the past, provisioning systems used to be very expensive toys, and many of the products from that era still survive on the market. But there are also new products, built on newer technologies and on years of experience. These are not toys anymore and they are not expensive either. Some of the new products have business models that are feasible for the management of large numbers of identities. They can be deployed in cloud solutions, used to manage customer identities, social network identities and so on. These products are built for the Internet age.

Example: Evolveum midPoint

A couple of years ago there was no suitable solution for the Internet scale. Provisioning was strictly confined to enterprise boundaries and it was very expensive. But all of that is over.

MidPoint is one of several open source identity provisioning systems. The project went through several years of very rapid development and created a unique, flexible and efficient system. MidPoint is a next-generation provisioning system. It is built on lightweight technologies that the Internet has produced. Technologically it has very little in common with its predecessors. However, midPoint is designed to accomplish everything that the previous generations of products did - and much more. MidPoint comes with a very interesting pricing model that is especially suitable for cloud deployments, management of customer identities and similar large-scale deployments.

MidPoint has features suitable for the Internet environment:

There are also features essential for the enterprise:

MidPoint is an open source system with full commercial support. This is essential because:

See Also