
This page is out of date. It needs an update.

 

MidPoint uses objects composed of properties and containers (see Prism Objects). This is an abstract data representation, and XML and JSON are just two of many possible formats for it. MidPoint objects are defined by the Data Model and provide the static part of the common identity model implemented by midPoint.

The midPoint data representation model is, strictly speaking, not bound to any specific format such as XML, JSON or ASN.1. However, it was designed with XML in mind and it has mostly been used in its XML form.

Why XML?

Although XML is not an ideal mechanism, we have decided to use XML because:

  • XML is popular. Almost everybody understands XML and also the XML-related technologies (XML schema, XPath, ...). Therefore by choosing XML we considerably lower the entry barrier.
  • XML is widespread. There are a lot of implementations of XML parsers, schema processors, path and pointer languages, etc. There are also standard or semi-standard interfaces for such implementations. If one implementation does not work for any reason, there is usually another option that does not completely ruin all the code that we have written to date.
  • XML is mature. While XML is not ideal, its problems are well understood. The ways to work around them or avoid them are also (mostly) known. With a bit of care we can design a system that takes advantage of XML benefits while avoiding XML problems.
  • XML supports namespaces. Namespaces are essential for extensibility, and extensibility is critical for integration systems. Namespaces also provide a natural way to express compatibility of data formats and interfaces.
  • XML supports schema. Schema is important because it speeds up the development cycle (DRY, problems can be detected during build, support for tooling), speeds up the customization cycle and helps to define interfaces.
  • XML is extensible. Not only the XML itself, but especially other XML-related specifications. E.g. annotations in XML schema, custom functions in XPath, etc. Extensibility is a critical feature.

The other data formats that we have considered are significantly worse in these aspects. ASN.1 would be a good candidate, but it is not widespread and only a few people really understand it. Therefore it would be a considerable entry barrier.

XML Drawbacks

Type information in XML is usually encoded in XML Schema. While this works well for static XML structures, it is very difficult to use for dynamic parts. The "xsi:type" mechanism may work for dynamic parts, but it is quite cumbersome to use. This is especially inconvenient when static and dynamic data are combined.

XML documents are difficult to compare unless full schema information is available. E.g. QNames in element and attribute values cannot be distinguished from ordinary strings without a schema. Therefore it is almost impossible to manage namespace prefixes or even to tell whether two documents are equivalent. This is a major obstacle to correct layering of the system and to implementing advanced mechanisms such as the relative change model.
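The comparison problem can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. The documents, namespace URIs and the "kind" element below are hypothetical and are not taken from the midPoint schema. The parser resolves prefixes in element names, but it leaves a QName stored in element text untouched, so the two documents look different until the caller supplies the schema knowledge that "kind" holds a QName.

```python
import io
import xml.etree.ElementTree as ET

# Two hypothetical documents: same namespaces, different prefixes.
DOC_A = (
    '<a:user xmlns:a="http://example.com/ns/a" xmlns:t="http://example.com/ns/types">'
    '<a:kind>t:employee</a:kind></a:user>'
)
DOC_B = (
    '<x:user xmlns:x="http://example.com/ns/a" xmlns:y="http://example.com/ns/types">'
    '<x:kind>y:employee</x:kind></x:user>'
)

# Stand-in for schema knowledge: the set of elements whose value is a QName.
QNAME_VALUED_TAGS = {"{http://example.com/ns/a}kind"}


def names_and_text(xml_text):
    """Schema-less view: element names are resolved, text values are not."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.text) for el in root.iter()]


def names_and_resolved_text(xml_text, qname_valued_tags):
    """Schema-aware view: also resolve prefixes inside QName-valued text.

    Simplification: prefixes are kept in one flat dictionary, so nested
    redeclarations of the same prefix are not scoped correctly.
    """
    prefixes = {}
    result = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
            continue
        text = payload.text
        if payload.tag in qname_valued_tags and text and ":" in text:
            prefix, local = text.split(":", 1)
            text = "{%s}%s" % (prefixes[prefix], local)
        result.append((payload.tag, text))
    return result


print(names_and_text(DOC_A) == names_and_text(DOC_B))
print(names_and_resolved_text(DOC_A, QNAME_VALUED_TAGS)
      == names_and_resolved_text(DOC_B, QNAME_VALUED_TAGS))
```

The first comparison fails only because of the prefixes inside the text values. The second succeeds, but it needs the list of QName-valued elements, which is exactly the information a schema provides.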

JSON

JSON is far from being ideal. It was originally seen as the worst of all realistic data representation choices because:

  • JSON was not widespread when the original design and architecture of midPoint started. This has changed since then, but it still influenced the original motivation.
  • JSON is still not mature enough. It is actually re-inventing XML and repeating some of the most severe XML mistakes (see below).
  • JSON has no namespace support. Without a namespacing mechanism it is extremely difficult to have extensibility and interoperability at the same time. As a result, reliable and standard support for versioning and incompatibility marking is almost non-existent.
  • JSON has several competing schema proposals, but no real standard and very poor support in code. Schema is crucial for a maintainable system.
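The missing namespace support can be illustrated with a small Python sketch. The two extensions, the "level" property and the qualify helper are hypothetical. The URI-prefixed key is just one ad-hoc convention, not a JSON standard, and it is not claimed to be the format that midPoint uses.

```python
import json

# Two independently developed extensions that both define "level".
hr_extension = {"level": "senior"}
security_extension = {"level": "restricted"}

# Plain JSON objects share one flat key space, so one value silently wins.
merged = {**hr_extension, **security_extension}


def qualify(namespace, properties):
    """Hypothetical helper: prefix each key with a namespace URI."""
    return {namespace + "#" + name: value for name, value in properties.items()}


# With qualified keys both values survive, but every producer and consumer
# has to agree on the convention because JSON itself does not define one.
qualified = {
    **qualify("http://example.com/ext/hr", hr_extension),
    **qualify("http://example.com/ext/security", security_extension),
}

print(json.dumps(merged))
print(json.dumps(qualified, indent=2))
```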

Therefore JSON in its pure form is not usable for an advanced software system such as midPoint. However, JSON also has some advantages:

  • JSON is simple. It is a less "talkative" format than XML. It may be easier to write and read.
  • JSON has better storage efficiency than XML.

Even though JSON has considerable drawbacks in its pure form, we have found a way to use it efficiently. MidPoint uses a subset of JSON and defines its own JSON-based data representation format that works around most of the JSON problems. The implementation is currently underway.

Objects Are Not Documents

Although midPoint objects are represented in XML and JSON, they must not be considered XML or JSON documents. MidPoint objects are composed of (potentially multi-valued) properties and containers (Prism Objects). E.g. midPoint does not maintain the ordering of values. XML and JSON are just ways to represent midPoint objects in a readable and "serializable" form. Therefore all midPoint objects can be represented as XML and JSON documents, but not every XML or JSON document is a valid midPoint object.
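The difference between comparing documents and comparing objects can be sketched as follows. This is a minimal illustration in Python, not midPoint code. The property names are hypothetical, and the sketch assumes that every JSON array stands for a multi-valued property whose value order carries no meaning.

```python
import json

# The same hypothetical object serialized twice with a different value order.
DOC_A = '{"name": "jack", "organizationalUnit": ["Sales", "Marketing"]}'
DOC_B = '{"organizationalUnit": ["Marketing", "Sales"], "name": "jack"}'


def normalise(value):
    """Return a form of the value in which array order no longer matters."""
    if isinstance(value, list):
        # Serialize each value so that mixed or nested values can be sorted.
        return sorted(json.dumps(normalise(v), sort_keys=True) for v in value)
    if isinstance(value, dict):
        return {key: normalise(item) for key, item in value.items()}
    return value


def same_object(doc_a, doc_b):
    """Compare two JSON documents as objects with unordered values."""
    return normalise(json.loads(doc_a)) == normalise(json.loads(doc_b))


print(json.loads(DOC_A) == json.loads(DOC_B))  # document comparison
print(same_object(DOC_A, DOC_B))               # object comparison
```

Compared as JSON documents the two texts differ, because JSON arrays are ordered. Compared as objects with unordered values they are the same.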

Extensibility

Practically all midPoint objects have an <extension> section that can hold any custom properties. Therefore it is easy to extend existing object types such as User or AccountShadow. The properties are identified by URI (represented as a QName in XML), which gives a natural namespace allocation and minimizes the chance of naming conflicts even if several extensions are used at the same time.
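A simplified illustration of how qualified names keep extension properties apart, written in Python with the standard xml.etree.ElementTree module. The XML below is a hypothetical fragment, not a valid midPoint object: real objects use the midPoint schema namespaces, which are omitted here, and the two extension namespaces and the "level" property are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified user with properties from two extensions.
USER_XML = """
<user xmlns:hr="http://example.com/ext/hr"
      xmlns:sec="http://example.com/ext/security">
  <name>jack</name>
  <extension>
    <hr:level>senior</hr:level>
    <sec:level>restricted</sec:level>
  </extension>
</user>
"""


def extension_properties(xml_text):
    """Collect extension properties keyed by their qualified name."""
    root = ET.fromstring(xml_text.strip())
    extension = root.find("extension")
    properties = {}
    if extension is None:
        return properties
    for child in extension:
        # ElementTree exposes qualified names as "{namespace-uri}local-name".
        # Values are collected in a list because properties may be multi-valued.
        properties.setdefault(child.tag, []).append(child.text)
    return properties


for qualified_name, values in extension_properties(USER_XML).items():
    print(qualified_name, values)
```

Both extensions define a property with the local name "level", yet the two values do not collide because the namespace URI is part of each property's identity.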

New object types are quite difficult to create. The easiest way is to use the Generic Object construct to store custom data. Support for heavyweight object customization will most likely be provided later on. Yet there is one issue to consider: midPoint interfaces can convey business objects, but they are designed primarily to support IDM objects (such as users and accounts). MidPoint is an identity management system, not a generic business solution. Business objects are supposed to augment the identity management functionality (the midPoint Data Model); midPoint is not designed as a catch-all solution.
