JacksonPolymorphicDeserialization

Russell Howe edited this page Jul 23, 2020 · 17 revisions

Feature: Polymorphic Type Handling (PTH), formerly known as "Polymorphic Deserialization"

Polymorphic type handling refers to adding enough type information to serialized content that the deserializer can instantiate the appropriate subtype of a value. This matters when the actual value is an instance of a subtype but the field, setter, or creator method declares only a single type (the supertype).

Take the example below:

  public class Zoo {
    public Animal animal;
  }

  static class Animal { // All animals have names, for our demo purposes...
    public String name;
    protected Animal() { }
  }

  static class Dog extends Animal {
    public double barkVolume; // in decibels
    public Dog() { }
  }

  static class Cat extends Animal {
    public boolean likesCream; // public, so that it is (de)serialized like the other fields
    public int lives;
    public Cat() { }
  }

We want Zoo to be serialized AND deserialized properly, i.e. reading the JSON back should create an instance of Dog even though the field is declared as Animal. To do so, the serializer can be instructed to embed additional information that lets the deserializer know whether it has a Cat or a Dog instance.

How? Glad you asked....

1. Usage

There are actually two complementary ways to resolve this problem.

1.1. Global default typing

You can globally declare that certain types always require additional type information. Before going into details of how to do this, please note that there are security considerations regarding the use of this mechanism (explained later in this section).

To require type information for all objects of the affected types, enable default typing on the mapper:

  // Note: a validator built with no allow rules approves nothing, so typed values are
  // rejected on deserialization; add rules such as allowIfSubType(...) for your own types
  PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder().build();
  // then one of:
  objectMapper.activateDefaultTyping(ptv); // defaults to DefaultTyping.OBJECT_AND_NON_CONCRETE
  objectMapper.activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL);

This means that for all affected types (with the no-argument variant, Object.class and all non-concrete types; with NON_FINAL, all non-final classes), a certain amount of default type information (more specifically, the Java class name) is included, using the default inclusion mechanism (an additional wrapper array in JSON). This global default can be overridden by per-class annotations (more on this in the next section).

The only thing you can configure, then, is just which types (classes) are affected. Choices are:

  • JAVA_LANG_OBJECT: only affects properties of type Object.class
  • OBJECT_AND_NON_CONCRETE: affects Object.class and all non-concrete types (abstract classes, interfaces)
  • NON_CONCRETE_AND_ARRAYS: same as above, and all array types of the same (direct elements are non-concrete types or Object.class)
  • NON_FINAL: affects all types that are not declared 'final', and array types of non-final element types.

This is often the simplest initial way to enable enough type information to get things going.
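As a rough illustration of the above, here is a minimal round-trip sketch using NON_FINAL default typing. It assumes jackson-databind 2.10 or later on the classpath; the class name DefaultTypingDemo and the helper methods newMapper, write and read are made up for this example, and the validator rule (allow only classes nested in the demo class) is just one possible choice.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;

final class DefaultTypingDemo {
    public static class Zoo { public Animal animal; }
    public static class Animal { public String name; protected Animal() { } }
    public static class Dog extends Animal { public double barkVolume; public Dog() { } }

    // Builds a mapper that adds class-name type ids for all non-final types.
    static ObjectMapper newMapper() {
        // Assumption for this demo: only classes nested in DefaultTypingDemo are allowed.
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType(DefaultTypingDemo.class.getName() + "$")
                .build();
        ObjectMapper mapper = new ObjectMapper();
        mapper.activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL);
        return mapper;
    }

    static String write(Zoo zoo) {
        try {
            return newMapper().writeValueAsString(zoo);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Zoo read(String json) {
        try {
            return newMapper().readValue(json, Zoo.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        dog.name = "Rex";
        dog.barkVolume = 90.0;
        Zoo zoo = new Zoo();
        zoo.animal = dog;

        String json = write(zoo);
        System.out.println(json); // class names appear as first elements of wrapper arrays
        System.out.println(read(json).animal.getClass().getSimpleName());
    }
}
```

Note that with NON_FINAL the root Zoo value is wrapped as well, so content written this way should be read back by a mapper configured the same way.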

It is also possible to customize global defaulting using ObjectMapper.setDefaultTyping(...): you just have to implement your own TypeResolverBuilder (which is not very difficult), and by doing so you can configure all aspects of type information. The builder itself is just a shortcut for building the actual handlers.
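One possible way to do this, sketched under the assumption of jackson-databind 2.10 or later, is to subclass ObjectMapper.DefaultTypeResolverBuilder and override useForType(). The names CustomTyperDemo and AnimalOnlyTyper are hypothetical; the sketch adds type ids only for Animal and its subtypes, and includes them as an "@class" property instead of a wrapper array.

```java
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.TypeResolverBuilder;

final class CustomTyperDemo {
    public static class Zoo { public Animal animal; }
    public static class Animal { public String name; protected Animal() { } }
    public static class Dog extends Animal { public double barkVolume; public Dog() { } }

    // Hypothetical builder: type ids are used only for Animal and its subtypes.
    static class AnimalOnlyTyper extends ObjectMapper.DefaultTypeResolverBuilder {
        private static final long serialVersionUID = 1L;

        AnimalOnlyTyper(PolymorphicTypeValidator ptv) {
            super(ObjectMapper.DefaultTyping.NON_FINAL, ptv);
        }

        @Override
        public boolean useForType(JavaType type) {
            return Animal.class.isAssignableFrom(type.getRawClass());
        }
    }

    static ObjectMapper newMapper() {
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType(Animal.class) // only Animal subtypes may be instantiated
                .build();
        TypeResolverBuilder<?> typer = new AnimalOnlyTyper(ptv);
        typer = typer.init(JsonTypeInfo.Id.CLASS, null);   // class name as type id
        typer = typer.inclusion(JsonTypeInfo.As.PROPERTY); // included as a property
        typer = typer.typeProperty("@class");
        ObjectMapper mapper = new ObjectMapper();
        mapper.setDefaultTyping(typer);
        return mapper;
    }

    static String write(Zoo zoo) {
        try {
            return newMapper().writeValueAsString(zoo);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Zoo read(String json) {
        try {
            return newMapper().readValue(json, Zoo.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        dog.name = "Rex";
        Zoo zoo = new Zoo();
        zoo.animal = dog;
        String json = write(zoo);
        System.out.println(json); // Zoo itself is untyped; the animal carries "@class"
        System.out.println(read(json).animal.getClass().getSimpleName());
    }
}
```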

1.1.1 Security Risks using Global default typing

Note that if you:

  • Enable use of global type information, using Class name as the type id AND
  • Accept content from untrusted sources

you may expose a security hole: an untrusted source may manage to specify a class that is accessible through the class loader and that exposes methods and/or fields whose access opens an actual security hole. Such classes are known as "deserialization gadgets". Although Jackson contains a predefined block list for known cases, new ones are found over time, so at any given time there may be unblocked attack vectors.

Because of this, use of "default typing" is not encouraged in general, and is specifically recommended against if the source of content is not trusted. Conversely, default typing may be used for processing content in cases where both ends (sender and receiver) are controlled by the same entity.
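If default typing is used anyway, a narrow allow list limits what incoming content can instantiate. The following is a sketch only, assuming jackson-databind 2.10 or later: SafeTypingDemo, Holder and the helper methods are invented for the example, and Gadget is a hypothetical stand-in for a class that must never be created from input.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;

final class SafeTypingDemo {
    public static class Holder { public Object payload; }
    public static class Dog { public String name; }
    // Hypothetical stand-in for a "gadget" class; it is not on the allow list.
    public static class Gadget { public String command; }

    static ObjectMapper newMapper() {
        // Only Dog (and its subtypes) may be named as a type id in incoming content.
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType(Dog.class)
                .build();
        ObjectMapper mapper = new ObjectMapper();
        mapper.activateDefaultTyping(ptv); // OBJECT_AND_NON_CONCRETE: affects the Object field
        return mapper;
    }

    // Builds content that asks for the given class as the payload type.
    static String payloadJson(Class<?> type) {
        return "{\"payload\":[\"" + type.getName() + "\",{}]}";
    }

    // Returns true if the content deserializes, false if Jackson rejects it.
    static boolean isAccepted(String json) {
        try {
            newMapper().readValue(json, Holder.class);
            return true;
        } catch (JsonProcessingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAccepted(payloadJson(Dog.class)));    // allowed by the validator
        System.out.println(isAccepted(payloadJson(Gadget.class))); // rejected by the validator
    }
}
```

An allow list like this reduces exposure but does not make untrusted content safe in general; the advice above still applies.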

1.2. Per-class annotations

A more granular (and powerful) way to define what type information to add, and where, is to use the @JsonTypeInfo annotation (and possibly a couple of related ones). For example, we could have annotated Animal as follows:

 @JsonTypeInfo(use=JsonTypeInfo.Id.CLASS, include=JsonTypeInfo.As.PROPERTY, property="@class")
 class Animal { } 

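(which, incidentally, just spells out the annotation's own defaults: @JsonTypeInfo(use=JsonTypeInfo.Id.CLASS) alone behaves the same way. Note that global default typing, in contrast, uses a wrapper array for inclusion).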

What does that mean?

  • All instances of annotated type and its subtypes use these settings (unless overridden by another annotation)
  • "Type identifier" to use is fully-qualified Java class name (like "org.codehaus.jackson.sample.Animal")
  • Type identifier is to be included as a (meta-)property, along with regular data properties, using name "@class" (the default name to use depends on the type id: for class names it would be "@class")
  • Use default type resolver (no @JsonTypeResolver added); as well as default type id resolver (no @JsonTypeIdResolver)
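To make the list above concrete, here is a small sketch of the annotated Animal in a round trip. It assumes jackson-databind 2.10 or later; ClassIdAnnotationDemo and its write and read helpers are names made up for this example.

```java
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

final class ClassIdAnnotationDemo {
    public static class Zoo { public Animal animal; }

    // Applies to Animal and, by inheritance, to Dog.
    @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "@class")
    public static class Animal { public String name; protected Animal() { } }

    public static class Dog extends Animal { public double barkVolume; public Dog() { } }

    private static final ObjectMapper MAPPER = new ObjectMapper(); // no default typing needed

    static String write(Zoo zoo) {
        try {
            return MAPPER.writeValueAsString(zoo);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Zoo read(String json) {
        try {
            return MAPPER.readValue(json, Zoo.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        dog.name = "Rex";
        dog.barkVolume = 90.0;
        Zoo zoo = new Zoo();
        zoo.animal = dog;

        String json = write(zoo);
        System.out.println(json); // the animal object carries an "@class" property
        System.out.println(read(json).animal.getClass().getSimpleName());
    }
}
```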

We could have chosen differently as follows:

  • Type id: possible choices are CLASS (fully-qualified Java class name), MINIMAL_CLASS (relative Java class name, if base class and sub-class are in same package, leave out package name), NAME (use logical name, separately defined, to determine actual class) and CUSTOM (type id handled using custom resolver)
  • Inclusion: possible choices are PROPERTY (include as regular property along with member properties), WRAPPER_OBJECT (use additional JSON wrapper object; type id used as field name, actual serialized Object as value), WRAPPER_ARRAY (first element is type id; second element is the serialized Object)
  • Property name: for inclusion method of PROPERTY, can use any name; defaults depend on type id.
  • To plug in custom type id resolver use @JsonTypeIdResolver
  • To plug in custom type resolver use @JsonTypeResolver
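As an illustration of the CUSTOM type id choice together with @JsonTypeIdResolver, here is a hedged sketch that extends TypeIdResolverBase. The resolver AnimalIdResolver, the "kind" property name, and the idea of using the lower-cased simple class name as the id are all assumptions made for this example, not anything Jackson prescribes.

```java
import java.util.Locale;

import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DatabindContext;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.annotation.JsonTypeIdResolver;
import com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase;

final class CustomIdDemo {
    @JsonTypeInfo(use = JsonTypeInfo.Id.CUSTOM, include = JsonTypeInfo.As.PROPERTY, property = "kind")
    @JsonTypeIdResolver(AnimalIdResolver.class)
    public static class Animal { public String name; protected Animal() { } }
    public static class Dog extends Animal { public Dog() { } }
    public static class Cat extends Animal { public Cat() { } }

    // Hypothetical resolver: the type id is the lower-cased simple class name.
    public static class AnimalIdResolver extends TypeIdResolverBase {
        private JavaType baseType;

        @Override
        public void init(JavaType baseType) { this.baseType = baseType; }

        @Override
        public String idFromValue(Object value) {
            return idFromValueAndType(value, value.getClass());
        }

        @Override
        public String idFromValueAndType(Object value, Class<?> suggestedType) {
            return suggestedType.getSimpleName().toLowerCase(Locale.ROOT);
        }

        @Override
        public JavaType typeFromId(DatabindContext context, String id) {
            switch (id) {
                case "dog": return context.constructSpecializedType(baseType, Dog.class);
                case "cat": return context.constructSpecializedType(baseType, Cat.class);
                default: return null; // unknown id: left for Jackson to report
            }
        }

        @Override
        public JsonTypeInfo.Id getMechanism() { return JsonTypeInfo.Id.CUSTOM; }
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    static String write(Animal animal) {
        try {
            return MAPPER.writeValueAsString(animal);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Animal read(String json) {
        try {
            return MAPPER.readValue(json, Animal.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        dog.name = "Rex";
        System.out.println(write(dog)); // includes "kind":"dog"
        System.out.println(read("{\"kind\":\"cat\",\"name\":\"Tom\"}").getClass().getSimpleName());
    }
}
```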

Finally: it is also possible to use JAXB annotations to indicate need for adding type information (see @XmlElements for details).

2. On type ids

Type ids that are based on the Java class name are fairly straightforward: it's just the class name, possibly with some simple prefix removal (for the "minimal" variant). But type name is different: one has to have a mapping between the logical name and the actual class. This relationship is defined as follows:

  • Specify sub-classes using @JsonSubTypes annotation: without this, deserializer will not be able to locate sub-types to use
  • Specify logical names for sub-classes using either @JsonTypeName (for type being named), OR list name within @JsonSubTypes entry for sub-class (if both are defined, one defined in @JsonSubTypes has precedence).
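Both ways of naming subtypes can be sketched as follows. This assumes jackson-databind 2.10 or later; TypeNameDemo, its helpers, and the logical names "dog" and "cat" are chosen just for illustration.

```java
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.annotation.JsonTypeName;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

final class TypeNameDemo {
    public static class Zoo { public Animal animal; }

    @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "@type")
    @JsonSubTypes({
        @JsonSubTypes.Type(value = Dog.class, name = "dog"), // name listed in the entry
        @JsonSubTypes.Type(Cat.class)                        // name comes from @JsonTypeName
    })
    public static class Animal { public String name; protected Animal() { } }

    public static class Dog extends Animal { public double barkVolume; public Dog() { } }

    @JsonTypeName("cat")
    public static class Cat extends Animal { public int lives; public Cat() { } }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    static String write(Zoo zoo) {
        try {
            return MAPPER.writeValueAsString(zoo);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Zoo read(String json) {
        try {
            return MAPPER.readValue(json, Zoo.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Zoo zoo = read("{\"animal\":{\"@type\":\"cat\",\"name\":\"Tom\",\"lives\":9}}");
        System.out.println(zoo.animal.getClass().getSimpleName()); // logical name resolved to a class
        System.out.println(write(zoo)); // logical name written back, no Java class name in the JSON
    }
}
```

Since no Java class names end up in the JSON, this style is also the one that travels best to non-Java systems.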

Alternatively, you can also use JAXB annotations (specifically, @XmlElements) to establish type names, as well as to indicate the need to include type information.

In future we may want to add additional methods for linking types with sub-types: current method is not optimal for use cases where subtypes may be added dynamically; and it does add unnecessary back-links between types (even if as annotation metadata).

3. Additional thoughts

Back to choosing between global default typing, and explicit annotations. Which one should I choose?

  • If your system interacts with non-Java systems, you most likely should not use Java class name based type ids: hence, global defaults don't work
  • However, you could use
  • For large number of classes, global defaults are much less work: and

4. On Design

For those interested in actual progress from basic desire ("should be able to serialize any List of Objects") into implementation -- especially one that is quite complicated -- it may be interesting to read through original design notes. Here you go...

4.1. Definition of PTH

Polymorphism is an Object-Oriented Design concept that Java implements by class inheritance. Here it just means ability to construct instances of sub-classes of a given declared class, based on which sub-class was actually serialized. That is, even though during serialization the declared type is a super-type, it should be possible for the deserializer to properly resolve actual expected type of the value to assign.

Implementing support for polymorphic types is one of the highest priority items for Jackson development. At the same time, an implementation that covers the required use cases (including cases where not all communicating systems run on the Java platform) is not trivially simple to implement, so care has to be taken to create a simple, powerful and extensible design. To come up with such a design, let's first consider the design choices and alternatives that are available.

4.2. Design Choices

Instance Type Information (Type Metadata)

(aka "Type Id")

To be able to deserialize JSON object into types that instances were serialized from (and not just statically declared type, which is generally a supertype), some amount of per-instance type information is needed. There are multiple possible ways to do this. For example:

  • Directly include the Java class name as instance information (either as a fully-qualified name, or just a relative name to minimize size). This is the approach taken by packages such as XStream.
  • Include a type identifier that can be used to determine the actual class: this is often done by using an external type definition (Schema). This is the approach taken by frameworks like JAXB.
  • Use some other custom type inclusion method: type information is not necessarily limited to String values.

To give an idea of possible concrete examples of such instance type information, here are some examples (but please note that other choices discussed below would change actual mechanism of including type information):

  { // Using fully-qualified path
    "@class" : "com.fasterxml.beans.EmployeeImpl", ...
  }

  { // Using indirect type name
    "@type" : "Employee", ...
  }

  { // Fancy custom type information (can bind JSON object to a type object)
    "customType" : { "xmlType" : "http://foo.bar/schema.xsd", "preferredClass" : "com.foo.EmployeeClass" },
    ...
  }

Different mechanisms have different trade-offs; for example:

  • Direct inclusion of Java class names adds direct coupling to implementation classes and may make integration with non-Java systems difficult. But it is often the simplest solution if these limitations are acceptable.
  • Using indirect resolution adds complexity, which may be unnecessary for many use cases. Additionally JSON does not yet have an adequate type definition language: the most complete Schema language, http://json-schema.org/, is lacking in its OO type support (it focuses more on JSON-centric validation aspects). However, the level of indirection allows for flexibility and can work well with heterogeneous systems (JavaScript client with a Java service is a common case)
  • Custom type requires some amount of custom handling, but allows for ultimate flexibility for use cases that need it.

Methods for embedding Instance Type Information

(aka "Include As")

After deciding what type information (usually a Type Id of some kind) to add along with instance data, it is necessary to define how this information is to be included in actual JSON data. Some obvious methods of inclusion are:

  • Include type information directly as metadata properties, along with actual instance data. One challenge here is that whereas XML has mechanisms for dividing the namespace (both via attribute/element choice and by using XML Namespaces), all JSON properties live in the same namespace; that is, there is no natural division between data and metadata
  • Use a wrapping mechanism, similar to how JAXB uses the @XmlElements annotation (which is one way to achieve polymorphic deserialization)
    • This style limits type metadata to JSON Strings, or types that can be converted to/from Strings (enums, numbers, other simple scalar types)
    • JSON offers two obvious choices: JSON Object and JSON Array wrappers (see below)

Given examples from previous section, here are possible ways to embed class name information:

  // Type name as a property, same as above
  {
    "@type" : "Employee",
     ...
  }

  // Wrapping class name as pseudo-property (similar to JAXB)
  {
    "com.fasterxml.beans.EmployeeImpl" : {
       ... // actual instance data without any metadata properties
    }
  }

  // Wrapping class name as first element of wrapper array:
  [
    "com.fasterxml.beans.EmployeeImpl",
    {
       ... // actual instance data without any metadata properties
    }
  ]
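In Jackson 2.x these three shapes correspond to the JsonTypeInfo.As values PROPERTY, WRAPPER_OBJECT and WRAPPER_ARRAY. The sketch below produces all three for the same object by attaching the annotation through mix-ins. It assumes jackson-databind 2.5 or later (for addMixIn); InclusionStylesDemo and the mix-in interface names are invented for the example, and it uses the logical name "Employee" as the type id rather than a class name.

```java
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.annotation.JsonTypeName;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

final class InclusionStylesDemo {
    @JsonTypeName("Employee")
    public static class Employee { public String name = "Ann"; }

    // Hypothetical mix-ins: each one only contributes a @JsonTypeInfo setting.
    @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY)
    interface AsProperty { }

    @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.WRAPPER_OBJECT)
    interface AsWrapperObject { }

    @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.WRAPPER_ARRAY)
    interface AsWrapperArray { }

    // Serializes a fresh Employee with the given inclusion style applied.
    static String write(Class<?> mixIn) {
        ObjectMapper mapper = new ObjectMapper();
        mapper.addMixIn(Employee.class, mixIn);
        try {
            return mapper.writeValueAsString(new Employee());
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write(AsProperty.class));      // {"@type":"Employee","name":"Ann"}
        System.out.println(write(AsWrapperObject.class)); // {"Employee":{"name":"Ann"}}
        System.out.println(write(AsWrapperArray.class));  // ["Employee",{"name":"Ann"}]
    }
}
```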

4.3 Customizing type handling

Here are some aspects that should be customizable:

  • It should be possible to add handlers to allow both custom type id handling (not just use class names or type id), and to allow alternative type metadata inclusion methods.
  • In addition to per-type (class) annotations, it may be necessary (or at least very useful) to allow defining "default type handling" for classes of types: for example, enable type information inclusion for abstract types and interfaces.
  • For resolving type names, a handler is likewise needed; this should be configurable, with some sensible default handler.
  • When including type information as JSON properties, name of property to use should be configurable. For convenience there should be default property name that is based on type information style (for example, "@class" could be the default when using Java class name, and "@type" when including type name)

4.4. Implementation Plan

Current thinking is to support multiple Type Metadata methods as well as multiple inclusion methods. And since these are somewhat orthogonal, allow different combinations of the two. And finally make aspects configurable with sensible defaults, to try to fulfill sometimes conflicting goals of simplicity (as close to "zero configuration" as possible) and configurability (ability to customize behavior to exact needs and preferences).

As with other configurability for Jackson, emphasis will initially be on allowing configuration using Java Annotations. Benefits include some level of type-safety, as well as flexibility when using Annotation Mix-ins.

4.4.1 Baseline: global defaults

In addition to per-type definitions by annotations, it is necessary to allow setting a global baseline. The main reason is convenience: although it may be technically possible to annotate all possible types (or might not...), it may be infeasible for practical purposes. As such it is good to be able to define a default type handling baseline, different from the default of "no type information".

4.4.2 Per-type annotations

All annotations that do NOT depend on mapper types will live in the main org.codehaus.jackson.annotate package. Other annotations (with dependencies) will live in the org.codehaus.jackson.map.annotate package.

The main annotation used for indicating type is @JsonTypeInfo. It has the following properties:

  • use() for indicating the type id mechanism used:
    • value: Id enumeration, choice of (CLASS, MINIMAL_CLASS, NAME, CUSTOM)
  • include() for indicating the type inclusion mechanism:
    • value: As enumeration, choice of (PROPERTY, WRAPPER_ARRAY, WRAPPER_OBJECT)
  • property() used with the PROPERTY inclusion mechanism, to indicate an alternate property name to use (type ids define the default property to use)
