Preserving References

Preserving references means maintaining “is” relationships through the serialization / deserialization process. The automatic preserving of references is managed by the Processor and the behavior is set by the following Semantics:

AutoPreserveReferences

class AutoPreserveReferences(Semantic[bool]):
    """
    The formatter will keep track of objects that are referenced more than once in the object hierarchy and automatically
    convert subsequent instances of the same object to a PreservedReference
    """
    pass

ResolvePreservedReferences

class ResolvePreservedReferences(Semantic[bool]):
    """
    Preserved References are resolved by the formatter and never given to the object. This may be slower, but
    it ensures that the object will never have a property set that is of type PreservedReference. When this is not
    present the formatter should not resolve the preserved references. Objects can resolve them by subscribing to the
    context objects
    """
    pass

DetonateDanglingPreservedReferences

class DetonateDanglingPreservedReferences(Semantic[bool]):
    """
    This will call a method that raises an exception if any tracked PreservedReference has not been flagged for garbage
    collection at the end of the deserialization process. It can be used to test if all the PreservedReference objects
    have been replaced by their correct reference since they should all be de-referenced by the end of the process.
    """
    pass
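
As a rough illustration of the idea behind DetonateDanglingPreservedReferences, the sketch below tracks placeholders through weak references and raises if any of them is still alive at the end. It does not use grave_settings: `Placeholder`, `DanglingTracker` and `detonate_if_dangling` are hypothetical names, and the library's real check may work differently.

```python
import gc
import weakref


class Placeholder:
    """Hypothetical stand-in for a PreservedReference."""

    def __init__(self, ref):
        self.ref = ref


class DanglingTracker:
    """Remembers placeholders weakly, so tracking does not keep them alive."""

    def __init__(self):
        self._tracked = []

    def track(self, placeholder):
        self._tracked.append(weakref.ref(placeholder))
        return placeholder

    def detonate_if_dangling(self):
        gc.collect()  # also clears placeholders held only by reference cycles
        dangling = [p.ref for p in (r() for r in self._tracked) if p is not None]
        if dangling:
            raise RuntimeError(f"unresolved references: {dangling}")


tracker = DanglingTracker()
data = {"foo": [1, 2, 3]}
data["bar"] = tracker.track(Placeholder("foo"))

data["bar"] = data["foo"]        # resolved: nothing references the placeholder now
tracker.detonate_if_dangling()   # passes silently

data["baz"] = tracker.track(Placeholder("foo"))  # never resolved
try:
    tracker.detonate_if_dangling()
    detonated = False
except RuntimeError:
    detonated = True
```

A placeholder that has been replaced everywhere has no remaining strong references, so its weak reference goes dead. Any weak reference that still resolves points to a placeholder that was left in the data.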

EnforceReferenceLifecycle

class EnforceReferenceLifecycle(Semantic[bool]):
    """
    Ensures that an object id that is used to cache an object for PreservedReferences is not re-used by the interpreter
    by maintaining a reference to all objects cached for the duration of the operation.
    """
    pass

Note

I think it would be a bad idea to turn off EnforceReferenceLifecycle. The only rationale I can think of for doing so is to save some time and memory, but compared to the rest of the process this semantic is not expensive. It is hard to determine when this semantic is strictly necessary, and its function is that of a safety net: it protects the Processor from mixing up object ids. If the semantic is turned off and the Processor does get confused, the result is data loss, crashes and undefined behavior, because objects will incorrectly reference each other.
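
The risk comes from how Python's `id()` behaves: two objects with non-overlapping lifetimes may share the same id. The sketch below is not grave_settings code. `IdCache` is a hypothetical cache keyed by `id()`, used to show why holding a strong reference to every cached object keeps the ids unambiguous.

```python
class IdCache:
    """Maps id(obj) to the path where obj was first seen (hypothetical helper)."""

    def __init__(self, keep_alive=True):
        self._paths = {}
        self._alive = []  # strong references that pin every cached id
        self._keep_alive = keep_alive

    def remember(self, obj, path):
        self._paths[id(obj)] = path
        if self._keep_alive:
            self._alive.append(obj)

    def lookup(self, obj):
        return self._paths.get(id(obj))


cache = IdCache(keep_alive=True)
for i in range(1000):
    # Each list is a temporary: only the cache keeps it alive.
    cache.remember([i], f"item{i}")

fresh = ["never cached"]
false_hit = cache.lookup(fresh)  # None: no live cached object can share this id
```

With `keep_alive=False` the temporary lists would be freed right after being cached, and the interpreter would be free to give one of their ids to `fresh`. The lookup could then return a stale path for an object that was never cached.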

Simple example

Let's serialize a data structure with duplicate references:

from grave_settings.formatters.json import JsonFormatter

some_list = [1, 2, 3]
some_dict = {
    'foo': some_list,
    'bar': some_list
}

formatter = JsonFormatter()
print(formatter.dumps(some_dict))
Output
 {
      "foo": [
          1,
          2,
          3
      ],
      "bar": {
          "__class__": "grave_settings.formatter_settings.PreservedReference",
          "ref": "\"foo\""
      }
  }

Note

The preserved reference object is simple, but a note about the ref attribute is in order. The ref attribute is not guaranteed to follow any particular format, so we should not inspect it directly or use it to make decisions without consulting the FormatterSpec. In most cases the FormatterSpec or FormatterContext will expose methods that act as an abstraction layer. If you do choose to look at the ref value directly, be aware that a formatter or file format may be set up to use a different FormatterSpec than the one you are anticipating. In fact, the formatter built in Custom Formatter does not follow the default behavior.

This way, when the structure is deserialized, foo and bar will be associated with the same list object rather than with two separate lists that hold the same values. Note that this can behave unexpectedly if you are deserializing Circular References.
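
To make the round trip concrete, here is a minimal sketch of the general technique in plain Python. It is not how grave_settings implements it: the `__ref__` marker, the path format, and the `encode` and `resolve` helpers are all assumptions made for illustration.

```python
import json

REF_KEY = "__ref__"  # hypothetical marker, not the grave_settings format


def encode(obj, path=(), seen=None):
    """Replace repeated dicts/lists with a marker pointing at the first occurrence."""
    if seen is None:
        seen = {}
    if isinstance(obj, (dict, list)):
        if id(obj) in seen:
            return {REF_KEY: list(seen[id(obj)])}
        seen[id(obj)] = path
    if isinstance(obj, dict):
        return {k: encode(v, path + (k,), seen) for k, v in obj.items()}
    if isinstance(obj, list):
        return [encode(v, path + (i,), seen) for i, v in enumerate(obj)]
    return obj


def resolve(node, root=None):
    """Swap every marker for the object found at its recorded path."""
    if root is None:
        root = node
    items = node.items() if isinstance(node, dict) else enumerate(node)
    for key, value in list(items):
        if isinstance(value, dict) and REF_KEY in value:
            target = root
            for step in value[REF_KEY]:
                target = target[step]
            node[key] = target
        elif isinstance(value, (dict, list)):
            resolve(value, root)
    return root


some_list = [1, 2, 3]
text = json.dumps(encode({"foo": some_list, "bar": some_list}))
restored = resolve(json.loads(text))
print(restored["foo"] is restored["bar"])  # True
```

The first occurrence of the list is written in full and the second is written as a marker. Resolving the marker after loading restores the “is” relationship.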

Not preserving references

Let's have a quick look at the output when the AutoPreserveReferences semantic is disabled:

from grave_settings.formatters.json import JsonFormatter
from grave_settings.semantics import AutoPreserveReferences

some_list = [1, 2, 3]
some_dict = {
    'foo': some_list,
    'bar': some_list
}

formatter = JsonFormatter()
formatter.add_semantics(AutoPreserveReferences(False))  # [1]
print(formatter.dumps(some_dict))

Note [1]

Adding AutoPreserveReferences to the formatter sets it as a default Semantic for this formatter object. The Serializer Processor has this set to True by default, but the formatter defaults will override it. This effectively adds the Semantic to the root frame, not as a frame semantic, meaning that it will propagate through the entire process. No reference will be preserved anywhere in the hierarchy.

Output
  {
      "foo": [
          1,
          2,
          3
      ],
      "bar": [
          1,
          2,
          3
      ]
  }
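
The consequence shows up when this output is read back. The snippet below uses Python's standard json module rather than grave_settings, purely to illustrate the point: with no PreservedReference in the document, foo and bar come back as equal but unrelated lists.

```python
import json

text = '{"foo": [1, 2, 3], "bar": [1, 2, 3]}'
loaded = json.loads(text)

same_values = loaded["foo"] == loaded["bar"]  # equal contents
same_object = loaded["foo"] is loaded["bar"]  # but not the same list

loaded["foo"].append(4)
print(loaded["bar"])  # [1, 2, 3]: the change to foo is not visible through bar
```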

Disabling preserved references dynamically

Now let's say that you have two objects that both reference the same two lists, which are separate objects with identical contents. With one list you want to preserve the reference, but with the other you do not. How do we accomplish this?

from grave_settings.formatter_settings import NoRef
from grave_settings.formatters.json import JsonFormatter


class Foo:
    def __init__(self):
        self.list1 = [1, 2, 3]
        self.list2 = [1, 2, 3]

    def to_dict(self, *args, **kwargs):
        return {
            'list1': self.list1,
            'list2': NoRef(self.list2)
        }


class Bar:
    def __init__(self, foo: Foo):
        self.list1 = foo.list1
        self.list2 = foo.list2
        self.foo = foo

    def to_dict(self, *args, **kwargs):
        return {
            'list1': self.list1,
            'list2': NoRef(self.list2),
            'foo': self.foo
        }


formatter = JsonFormatter()
print(formatter.dumps(Bar(Foo())))
Output
  {
      "__class__": "__main__.Bar",
      "list1": [
          1,
          2,
          3
      ],
      "list2": [
          1,
          2,
          3
      ],
      "foo": {
          "__class__": "__main__.Foo",
          "list1": {
              "__class__": "grave_settings.formatter_settings.PreservedReference",
              "ref": "\"list1\""
          },
          "list2": [
              1,
              2,
              3
          ]
      }
  }

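Wrapping list2 in a NoRef object instructs the formatter to disable preserved references for that object, so list2 is written out in full both times, while list1 is still replaced by a PreservedReference on its second occurrence.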

Note

NoRef is a simple subclass of AddSemantics, which acts in a similar manner but allows you to attach arbitrary semantics to the wrapped object.

Warning

The same effect can be achieved in the example above by using Temporary instead of NoRef, but only because it is a special case. If the list contained Python objects or any other values that the formatter may transform during its operation, the original objects would have their data overwritten. This is because Temporary signals to the formatter not only that the object is not referenceable, but also that it can safely be mutated and destroyed. It “belongs” to the formatter after the formatter unwraps it. Temporary is typically used in handlers.
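
The difference between converting a copy and converting in place can be sketched in plain Python. This is only an analogy for the ownership rule described above, not the formatter's actual code: `Point`, `to_plain`, `convert_copy` and `convert_in_place` are hypothetical names.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def to_plain(value):
    """Turn a Point into a plain dict; leave other values untouched."""
    if isinstance(value, Point):
        return {"x": value.x, "y": value.y}
    return value


def convert_copy(items):
    # Safe for data the caller still owns: builds a new list.
    return [to_plain(v) for v in items]


def convert_in_place(items):
    # Only safe for data handed over to the converter: overwrites the list.
    for i, v in enumerate(items):
        items[i] = to_plain(v)
    return items


points = [Point(1, 2)]
converted = convert_copy(points)
still_a_point = isinstance(points[0], Point)  # original list untouched

convert_in_place(points)
now_a_dict = isinstance(points[0], dict)      # original list overwritten
```

A list of integers survives either approach unchanged, which is why the example above happens to work with Temporary.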