MidPoint was designed with a flexible implementation of its repository subsystem, which allows support for several types of storage technologies. The only storage technology that midPoint supports currently supports is relational database. Relational database always was, and it still is, the reliable workhorse of many identity management solutions, including midPoint.
Since its very beginning, midPoint was using a very generic approach to relational databases. This approach enabled support for several databases using the same code base. This worked very well for many years. However, there is also a drawback. Generic database code must strictly adhere to standardized SQL and it cannot take advantage of any database-specific features.
As midPoint deployment grow in complexity and size, we are reaching the boundaries of generic database support approach. Therefore we have to specialize our implementation and choose our champion among the database engine.
We have chosen PostgreSQL database. PostgreSQL is undoubtedly one of the best open source database engines. As midPoint is open source project, it was crucial to support an open source database. As we have doubts about the open source character of MySQL and the capabilities and community strength of MariaDB, PostgreSQL was a clear choice. That is also the reason that we have decided to focus on PostgreSQL and PostgreSQL was a recommended database engine since midPoint 4.0 release.
We have long-term plans to create special repository implementation for PostgreSQL database engine. This implementation will be able to fully utilize PostgreSQL features, therefore it can provide higher performance, scalability and efficiency.
However, re-working our repository implementation is no easy task. It will take time and energy. It can be difficult to focus our energies on that while at the same time we had to support a horde of other database engines. Therefore we have decided to reduce the number of database engines that we support in midPoint. It is still not clear how exactly will the future database support for midPoint look like. But few things are clear:
- PostgreSQL is the priority. PostgreSQL will be supported and recommended for all new midPoint versions. Newest PostgreSQL releases will be supported by new midPoint releases (given enough time to test new PostgreSQL releases).
- Support for MySQL and MariaDB is deprecated since midPoint 4.1. We plan to drop support for MySQL and MariaDB. There is no specific date to drop the support, but it will happen eventually. Users of MySQL and MariaDB should migrate to PostgreSQL soon.
- Other commercial database engines (Oracle, Microsoft SQL) remain in the gray zone. We have not made any decision about these. They are currently supported. However, it is quite likely that these databases will not be included in standard midPoint support service. Support services for these databases will need to be purchased separately - to compensate for extra effort supporting them. It is quite clear, that these database engines are not our priority. We currently do not plan to support any newer database versions for these commercial databases. Users that require clear database support plans should choose PostgreSQL database instead.
- Support for H2 is likely to remain for development and demonstration purposes.
The bottom line is, that if you start a new midPoint deployment, the best choice is to go with PostgreSQL. If you have existing midPoint deployment then you are probably OK to continue operation with current setup, but plan to migrate to PostgreSQL in the future (see below).
Migration to PostgreSQL
Firstly and most importantly, there is no need for panic or any other quick action right now. PostgreSQL is the recommended database, but other databases are still supported. And other databases will be supported until there is a clear migration path to PostgreSQL. There is no need to migrate your existing deployment right now. Just please stay alert, follow the release notes of midPoint releases (especially LTS releases) and watch for new. You will probably need to migrate to PostgreSQL eventually. But there is still time to prepare: both for you and midPoint.
We have the following plan for database support:
- Develop native repository implementation for PostgreSQL. This implementation will work only for PostgreSQL and it will take advantage of all the database features that we need. This will come with a new and more efficient database schema. For now this is roughly planned for midPoint 4.3 and 4.4.
- Create a native migration plan from some of the supported database to PostgreSQL. The details still need to be decided. The likely candidate for this is Oracle, but nothing is certain yet. Native migration means direct migration from Oracle database tables to PostgreSQL database tables, which should be fast and efficient. For all other cases there is an existing migration path using XML export/import. This is still a viable migration option, but it is slower.
- It is very likely that both the new native PostgreSQL repository and the old multi-repository implementations will be supported in midPoint 4.4 LTS, planned for release in late 2021. Expected lifetime of midPoint 4.4 LTS is until late 2024. Therefore you will have at least three-year migration period to PostgreSQL if you stick with midPoint 4.4 LTS.
- The plans beyond midPoint 4.4 LTS are still very fuzzy. We will probably start dropping support for some databases starting from midPoint 4.5. MySQL and MariaDB are very likely to be first. Future of support for other databases is still a matter of discussions. We are trying to gather user feedback to help us to create a smooth migration plan.
Support For Database Engine
MidPoint is using database for its data store. MidPoint needs database engine, but it does not include database engine. Database is installed, configured and managed separately from midPoint. The database engine is not considered to be part of midPoint. Therefore the database engine itself is not included in Evolveum support for midPoint.
Evolveum will make reasonable effort to make sure that midPoint works well with the databases. We are testing midPoint with several databses, using various versions and configurations. We are trying to make sure that midPoint works with the common database engines in common configurations. We will fix the bugs in midPoint database-related code as part of our support problems. In some cases we will also make work-arounds for some frequent and annoying database issues, although such decisions are made on case-by-case basis. Our responsibility is to make sure that midPoint side of midPoint-database interface works well.
However, deployment, configuration and maintenance of the database engine itself is not our responsibility. You have to install the database engine yourself. You have to configure it. We may have some recommendations for you, but database configuration is your responsibility. There are lots of variables here, single-node databases, clustered databases, many performance and availability trade-offs. Those depends on your environment, your requirements and workloads. We cannot possibly make such decisions for you. We provide database schema that midPoint needs. But fine-tuning of the database is up to you. There are small deployments and big deployments, each of them may have different database tuning requirements. For example, audit table may work with default setting in a small deployment that cleans up audit records after few months. But a large deployments that keeps audit records for years will need a very specific solution for storing audit data. Such solution is your responsibility.
Also, we will not fix the bugs in database engine. We will recommend specific version of database engine, but we are not distributing or maintaining the database engine itself. If midPoint does not work and the problem is a bug in the database, the solution will be to fix the bug in the database. Making sure that the bug is fixed is your responsibility. You should have a means to do it. This is usually a support contract with the database vendor or in-house capability.
From a practical perspective, midPoint usually works well with recent versions of PostgreSQL in usual configurations. Small midPoint deployment that are not performance-sensitive can probably operate without any problems even without any special support coverage for the database engine, just making sure that the database engine is properly maintained. Larger midPoint deployments will surely benefit from a dedicated PostgreSQL support. Such support contract can surely be consolidated with other applications in your organizations that are using PostgreSQL and it is also a great way how to support PostgreSQL project. Therefore we always recommend to secure a support for PostgreSQL database if you can afford it.
As for commercial and semi-commercial database (Oracle, MySQL, MS SQL) we always strongly recommend to purchase a support contract for database engine if you insist on using such database. However, perhaps the best strategy would be to migrate to PostgreSQL eventually.
Clusters and Cloud
Generally speaking, midPoint is supported in clustered database environments. Simply speaking: if midPoint works for you with a single-node database, then it will be most likely work for you also in when deployed with database cluster. However, there are limitations:
- Only environments that support full consistency guarantees are supported. Which means, that midPoint can only work for clustered configurations that can provide full ACID consistency and that are also configured to provide such guarantees. MidPoint will not work in environments with read-only replicas, environments that provide eventual consistency or any weaker consistency guarantees.
- Proper configuration of database clusters is a complex task that often involves trade-offs. For example clusters built for high availability and robustness may increase data maintenance overhead and it may result in lower overall performance. Analysis, design, proper configuration and maintenance of database clusters is your responsibility. Evolveum support will not resolve issues that are caused by inappropriate design or configuration of database clusters. It is unrealistic to expect that midPoint will be highly available or more performant just because it runs on a database cluster. The cluster has to be properly designed and configured to satisfy specific needs of each deployment.
- Database clusters may be configured in a variety of ways. Even a small configuration or tuning changes may cause issues. Even though midPoint is tested in a variety of database configurations during development, it is unrealistic to expect that it can be tested for every combination of database engine, versions, configurations and clustering topologies. If you happen to experience an issue with midPoint operation, we have to reproduce the issue in order to have any realistic chance to fix it. Some issues can be reproduced in our testing environment. However, presence of database clustering makes reproduction of issues much harder. Therefore please be prepared that Evolveum team may request an access to your testing environment where the issue can be reproduced in order to diagnose and fix an issue.