Preserving References
Preserving references means maintaining “is” relationships (object identity) through the serialization / deserialization process. The automatic preserving of references is managed by the Processor, and the behavior is set by the following Semantics:
class AutoPreserveReferences(Semantic[bool]):
    """
    The formatter will keep track of objects that are referenced more than once in the object hierarchy and automatically
    convert subsequent instances of the same object to a PreservedReference
    """
    pass
class ResolvePreservedReferences(Semantic[bool]):
    """
    Preserved References are resolved by the formatter and never given to the object. This may be slower, but
    it ensures that the object will never have a property set that is of type PreservedReference. When this is not
    present the formatter should not resolve the preserved references. Objects can resolve them by subscribing to the
    context objects
    """
    pass
class DetonateDanglingPreservedReferences(Semantic[bool]):
    """
    This will call a method that raises an exception if any tracked PreservedReference has not been flagged for garbage
    collection at the end of the deserialization process. It can be used to test if all the PreservedReference objects
    have been replaced by their correct reference since they should all be de-referenced by the end of the process.
    """
    pass
class EnforceReferenceLifecycle(Semantic[bool]):
    """
    Ensures that an object id that is used to cache an object for PreservedReferences is not re-used by the interpreter
    by maintaining a reference to all objects cached for the duration of the operation.
    """
    pass
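To make the DetonateDanglingPreservedReferences idea concrete, here is a minimal sketch that uses only the standard library. Placeholder and DanglingTracker are hypothetical names invented for this illustration, and this is not how grave_settings implements the check. It only shows the general technique: watch each placeholder through a weak reference and complain if any is still alive once resolution should be finished.

```python
import gc
import weakref


class Placeholder:
    """Hypothetical stand-in for a preserved reference awaiting resolution."""

    def __init__(self, ref):
        self.ref = ref


class DanglingTracker:
    """Hypothetical tracker; not the grave_settings implementation."""

    def __init__(self):
        self._tracked = []

    def track(self, placeholder):
        # A weak reference observes the placeholder without keeping it alive.
        self._tracked.append(weakref.ref(placeholder))
        return placeholder

    def dangling_count(self):
        gc.collect()  # also clears placeholders stuck in reference cycles
        return sum(1 for ref in self._tracked if ref() is not None)

    def detonate(self):
        count = self.dangling_count()
        if count:
            raise RuntimeError(f'{count} placeholder(s) were never resolved')


tracker = DanglingTracker()
data = {'foo': [1, 2, 3]}
data['bar'] = tracker.track(Placeholder('foo'))

before = tracker.dangling_count()   # placeholder still sits in data['bar']
data['bar'] = data['foo']           # resolve: replace it with the real object
after = tracker.dangling_count()
tracker.detonate()                  # passes silently once everything is resolved
print(before, after)                # 1 0
```

Calling detonate() before the placeholder is replaced would raise, which is the kind of signal such a check is meant to give in a test.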
Note
I think it would be a bad idea to turn off EnforceReferenceLifecycle. The only rationale I can think of for doing so would be to save some time and memory, but compared to the rest of the process this semantic is not expensive. It is hard to determine when this semantic is necessary, and its function is a safety net: it protects the Processor from mixing up object ids. If this semantic is turned off and the Processor does get confused, the result can be data loss, crashes and undefined behavior, because objects will incorrectly reference each other.
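To see why holding on to cached objects matters, here is a small library-independent sketch. IdCache is a hypothetical name and is not part of grave_settings. It only illustrates the general technique of keying a cache on id() while keeping a strong reference to each cached object, so the interpreter cannot hand the same id to a new object while the operation is running.

```python
class IdCache:
    """Hypothetical id()-keyed cache; not the grave_settings implementation."""

    def __init__(self, keep_alive: bool = True):
        self.keep_alive = keep_alive
        self._paths = {}   # id(obj) -> path where obj was first seen
        self._alive = []   # strong references that pin the ids

    def remember(self, obj, path):
        self._paths[id(obj)] = path
        if self.keep_alive:
            # While obj is referenced here, its id cannot be reused.
            self._alive.append(obj)

    def lookup(self, obj):
        return self._paths.get(id(obj))


cache = IdCache()
shared = [1, 2, 3]
cache.remember(shared, 'foo')
cache.remember([4, 5, 6], 'temp')   # short-lived object, pinned by the cache

found = cache.lookup(shared)
missing = cache.lookup([7, 8, 9])   # a new list cannot collide with a pinned id
print(found, missing)               # foo None
```

With keep_alive=False the temporary list would be freed right after remember() returns, and a later object could receive the same id and be mistaken for it. That is the mix-up the semantic guards against.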
Simple example
Let's serialize a data structure with duplicate references:
from grave_settings.formatters.json import JsonFormatter

some_list = [1, 2, 3]
some_dict = {
    'foo': some_list,
    'bar': some_list
}
formatter = JsonFormatter()
print(formatter.dumps(some_dict))
{
    "foo": [
        1,
        2,
        3
    ],
    "bar": {
        "__class__": "grave_settings.formatter_settings.PreservedReference",
        "ref": "\"foo\""
    }
}
Note
The preserved reference object is simple, but a note about the ref attribute is in order. The ref attribute is not guaranteed to follow any particular format, so we should not look at it directly or use it to make decisions without consulting the FormatterSpec. In most cases the FormatterSpec or FormatterContext will expose methods that act as an abstraction layer. This is just an FYI that a formatter or file format may be set up to use a different FormatterSpec than the one you are anticipating if you choose to look at the ref value directly. In fact, the formatter built in Custom Formatter does not follow the default behavior.
This way, when the structure is deserialized, foo and bar will be associated with the same list object, as opposed to two separate lists with the same values. Note that this can be weird if you are deserializing Circular References.
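The mechanism can be sketched without the library. The toy below is an assumption-laden illustration, not the grave_settings implementation: the `__ref__` marker and the path-as-list format are invented here, and as noted above the real ref format should not be relied on. It records the path at which each list or dict is first seen, emits a marker for repeats, and swaps the markers back for the already rebuilt object after loading.

```python
import json

MARKER = '__ref__'  # hypothetical marker key, not the grave_settings format


def encode(obj, path=(), seen=None):
    """Replace repeated list/dict objects with a marker holding their first path."""
    seen = {} if seen is None else seen
    if isinstance(obj, (list, dict)):
        if id(obj) in seen:
            return {MARKER: list(seen[id(obj)])}
        seen[id(obj)] = path
    if isinstance(obj, dict):
        return {k: encode(v, path + (k,), seen) for k, v in obj.items()}
    if isinstance(obj, list):
        return [encode(v, path + (i,), seen) for i, v in enumerate(obj)]
    return obj


def resolve(node, root=None):
    """Swap each marker for the object already rebuilt at the recorded path."""
    root = node if root is None else root
    items = node.items() if isinstance(node, dict) else enumerate(node)
    for key, value in list(items):
        if isinstance(value, dict) and MARKER in value:
            target = root
            for step in value[MARKER]:
                target = target[step]
            node[key] = target
        elif isinstance(value, (list, dict)):
            resolve(value, root)
    return root


some_list = [1, 2, 3]
text = json.dumps(encode({'foo': some_list, 'bar': some_list}))
restored = resolve(json.loads(text))
print(restored['foo'] is restored['bar'])  # True
```

Because both keys end up pointing at one list, appending to restored['foo'] is visible through restored['bar'], just as it was in the original structure.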
Not preserving references
Let's have a quick look at the output when the AutoPreserveReferences semantic is disabled:
from grave_settings.formatters.json import JsonFormatter
from grave_settings.semantics import AutoPreserveReferences

some_list = [1, 2, 3]
some_dict = {
    'foo': some_list,
    'bar': some_list
}
formatter = JsonFormatter()
formatter.add_semantics(AutoPreserveReferences(False))  # [1]
print(formatter.dumps(some_dict))
Note [1]
Adding AutoPreserveReferences on the formatter sets it as a default Semantic for this formatter object. The Serializer Processor has this set to True by default, but the formatter defaults will override it. This effectively adds the Semantic to the root frame, not as a frame semantic, meaning that it will propagate through the entire process. No reference will be preserved in the entire hierarchy.
{
    "foo": [
        1,
        2,
        3
    ],
    "bar": [
        1,
        2,
        3
    ]
}
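The practical cost of this output is that identity is lost when it is read back. The short check below uses only the standard library json module, not grave_settings, to round-trip the same structure and compare the two lists by value and by identity.

```python
import json

some_list = [1, 2, 3]
some_dict = {'foo': some_list, 'bar': some_list}

# Plain JSON writes the list out twice, so loading builds two separate lists.
restored = json.loads(json.dumps(some_dict))
same_value = restored['foo'] == restored['bar']
same_object = restored['foo'] is restored['bar']
print(same_value, same_object)  # True False

# A change made through one key is no longer visible through the other.
restored['foo'].append(4)
print(restored['bar'])  # [1, 2, 3]
```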
Disabling preserved references dynamically
Now let's say that you have two objects that both reference the same two lists, which are equal in value but are separate objects. With one list you want to preserve the reference, but with the other you do not. How do we accomplish this?
from grave_settings.formatter_settings import NoRef
from grave_settings.formatters.json import JsonFormatter


class Foo:
    def __init__(self):
        self.list1 = [1, 2, 3]
        self.list2 = [1, 2, 3]

    def to_dict(self, *args, **kwargs):
        return {
            'list1': self.list1,
            'list2': NoRef(self.list2)
        }


class Bar:
    def __init__(self, foo: Foo):
        self.list1 = foo.list1
        self.list2 = foo.list2
        self.foo = foo

    def to_dict(self, *args, **kwargs):
        return {
            'list1': self.list1,
            'list2': NoRef(self.list2),
            'foo': self.foo
        }


formatter = JsonFormatter()
print(formatter.dumps(Bar(Foo())))
{
    "__class__": "__main__.Bar",
    "list1": [
        1,
        2,
        3
    ],
    "list2": [
        1,
        2,
        3
    ],
    "foo": {
        "__class__": "__main__.Foo",
        "list1": {
            "__class__": "grave_settings.formatter_settings.PreservedReference",
            "ref": "\"list1\""
        },
        "list2": [
            1,
            2,
            3
        ]
    }
}
By wrapping a list in a NoRef object, the formatter is instructed to disable preserved references for the wrapped object.
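As a rough mental model, an opt-out wrapper of this kind can be sketched in plain Python. SkipTracking and the `__ref__` marker are hypothetical names invented for this illustration. The real NoRef works through the library's semantics system, which this toy does not attempt to reproduce. The encoder simply unwraps the marked value and writes it out in full, without recording it or replacing it with a reference.

```python
class SkipTracking:
    """Hypothetical wrapper playing the role NoRef plays in the text."""

    def __init__(self, value):
        self.value = value


def encode(obj, path=(), seen=None):
    seen = {} if seen is None else seen
    if isinstance(obj, SkipTracking):
        # Unwrap and emit in full: never recorded, never replaced by a marker.
        return encode_children(obj.value, path, seen)
    if isinstance(obj, (list, dict)):
        if id(obj) in seen:
            return {'__ref__': list(seen[id(obj)])}
        seen[id(obj)] = path
    return encode_children(obj, path, seen)


def encode_children(obj, path, seen):
    if isinstance(obj, dict):
        return {k: encode(v, path + (k,), seen) for k, v in obj.items()}
    if isinstance(obj, list):
        return [encode(v, path + (i,), seen) for i, v in enumerate(obj)]
    return obj


list1, list2 = [1, 2, 3], [1, 2, 3]
foo = {'list1': list1, 'list2': SkipTracking(list2)}
bar = {'list1': list1, 'list2': SkipTracking(list2), 'foo': foo}
encoded = encode(bar)
print(encoded['foo'])
# {'list1': {'__ref__': ['list1']}, 'list2': [1, 2, 3]}
```

The shape mirrors the output above: list1 is written once and referenced afterwards, while list2 is written in full both times.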
Note
NoRef is a simple subclass of AddSemantics, which acts in a similar manner but allows you to attach arbitrary semantics to the wrapped object.
Warning
The same effect can be achieved for the above by using Temporary instead of NoRef, but only because this is a special case. If the list contained Python objects or any other values that the formatter may transform during its operation, then the original objects would have their data overwritten. This is because Temporary signals to the formatter that not only is the object not referenceable, but it can also safely be mutated and destroyed. It “belongs” to the formatter after the formatter unwraps it. Temporary is typically used in handlers.