Implementing a mixin for Flatbuffers #79
Comments
I'm not the author, so I won't speak to what's possible, but having reviewed the code for the json, msgpack, and yaml serializers, it appears that all of the code building happens during conversion to a dictionary. No code building is applied to serialize or deserialize objects for the supported formats themselves. I do think this could be done without a code-building strategy, by keeping a cache on the mixin you create that maps field -> method call. From there you could handle both serialization and deserialization by running each field through that field -> method lookup table. I'm not all that familiar with flatbuffers, but maybe something like this ([Edit]: simplified to its essence):
from typing import Any, Mapping, Optional, Type, TypeVar
from mashumaro.mixins.dict import DataClassDictMixin
from mashumaro.serializer.json import DEFAULT_DICT_PARAMS
from typing_extensions import ClassVar, Protocol

T = TypeVar("T", bound="DataClassFlatBufferMixin")


def get_encoder(type: Type[T]):
    # use type and params to look up module and methods
    field_encoders = {
        'field_name': lambda buffer, **kwargs: bytearray()  # method call here
    }

    def encoder(buffer: bytearray, obj: Mapping[str, Any]):
        for key in obj.keys():
            field_encoders[key](buffer)
        return buffer

    return encoder


def get_decoder(type: Type[T]):
    # use type and params to look up module and methods
    field_decoders = {
        'field_name': lambda buffer, **kwargs: 0  # method call here
    }

    def decoder(buffer: bytearray):
        return {key: field_decoder(buffer) for key, field_decoder in field_decoders.items()}

    return decoder


class Decoder(Protocol):
    def __call__(self, buffer: bytearray) -> Mapping[str, Any]: ...


class Encoder(Protocol):
    def __call__(self, buffer: bytearray, obj: Mapping[str, Any]) -> bytearray: ...


class DataClassFlatBufferMixin(DataClassDictMixin):
    __slots__ = ()

    __flatbuffer_encoder: ClassVar[Optional[Encoder]]
    __flatbuffer_decoder: ClassVar[Optional[Decoder]]

    # similar to a metaclass (but simpler)
    # allows setting class variables on any subclass of this type
    def __init_subclass__(cls: Type[T], **kwargs):
        super().__init_subclass__(**kwargs)
        cls.__flatbuffer_encoder = None
        cls.__flatbuffer_decoder = None

    def to_flatbuffer(self: T, buffer: bytearray):
        clazz = type(self)
        if not clazz.__flatbuffer_encoder:
            clazz.__flatbuffer_encoder = get_encoder(type(self))
        return clazz.__flatbuffer_encoder(
            buffer,
            self.to_dict(**dict(DEFAULT_DICT_PARAMS)),
        )

    @classmethod
    def from_flatbuffer(
        cls: Type[T],
        data: bytearray,
    ) -> T:
        if not cls.__flatbuffer_decoder:
            cls.__flatbuffer_decoder = get_decoder(cls)
        return cls.from_dict(
            cls.__flatbuffer_decoder(data),
            **dict(DEFAULT_DICT_PARAMS),
        )
For reference, what I currently do is something like this.
Flatbuffer schema:
Using generated code (API generated by Flatbuffer compiler):
And then I have a dataclass defined like this:
which is used by my encoder library, which operates based on the dataclass definition and "generates" code:
@BrutalSimplicity thanks for that suggestion. Do you think your idea would work with the "string" case above? For that I would need to call a few functions:
Note that each string in this code is normally generated from the dataclass fields; it's hardcoded here for brevity, so the actual code would have a few extra calls.
I think those getattr calls are going to be expensive(?). However, perhaps it's possible to emit them as a code object and use it in place of the lambda, as you suggested. It seems like it would work.
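To make the "emit a code object" idea concrete, here is a minimal sketch that builds the source of one encoder function per dataclass and compiles it with `exec`, so every `getattr` lookup runs once at build time. Everything named here is an assumption for illustration: `build_encoder`, the `Start`/`End`/`Add<Field>` naming, and `fake_api` (a plain-Python stand-in for a module generated by the Flatbuffer compiler, with a list playing the role of the builder). It is not mashumaro's code builder or the real generated API.

```python
from dataclasses import dataclass, fields
from types import SimpleNamespace


def camel(name: str) -> str:
    """Turn 'max_hp' into 'MaxHp' (assumed naming of generated Add functions)."""
    return "".join(part.capitalize() for part in name.split("_"))


def build_encoder(cls, api):
    """Generate and compile a single encode(builder, obj) function for `cls`."""
    # All getattr() lookups happen here, once per class.
    namespace = {"_start": getattr(api, "Start"), "_end": getattr(api, "End")}
    lines = ["def encode(builder, obj):", "    _start(builder)"]
    for f in fields(cls):
        namespace[f"_add_{f.name}"] = getattr(api, "Add" + camel(f.name))
        lines.append(f"    _add_{f.name}(builder, obj.{f.name})")
    lines.append("    return _end(builder)")
    # The compiled function resolves _start/_add_*/_end as its globals.
    exec("\n".join(lines), namespace)
    return namespace["encode"]


# Hypothetical stand-in for generated code; the "builder" is just a list.
fake_api = SimpleNamespace(
    Start=lambda b: b.append("start"),
    AddName=lambda b, v: b.append(("name", v)),
    AddHp=lambda b, v: b.append(("hp", v)),
    End=lambda b: len(b),
)


@dataclass
class Monster:
    name: str
    hp: int


encode_monster = build_encoder(Monster, fake_api)
builder = []
result = encode_monster(builder, Monster("orc", 80))
```

The generated function contains only direct calls, with no dictionary or `getattr` lookups per object. The "string" case would presumably mean emitting extra lines before `_start(builder)`, since FlatBuffers requires strings to be created before the table is started; that part is left out of this sketch.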
Hi, guys! I don't have experience with FlatBuffers, so it'll take me time to dive into this. But we can create subpackage
Is your feature request related to a problem? Please describe.
I'm interested in supporting Flatbuffers via a Mixin. I already have a DataClass-based encoder/decoder which uses getattr(...) to call the generated Flatbuffer code and loads modules with importlib.import_module(). However, that could be much faster if the encoding/decoding code were generated once for the schema.
Describe the solution you'd like
So far I can see how to implement the various serialization hooks (pre/post), but what would be the best way to implement the field serialization?
Generally, the code for each hook needs to, based on the table/field name, load a module, call getattr() to find the right method, and then somehow emit that call in a form the code builder can use. Possibly a default_encoder (Encoder)? Essentially, at some point I need the list of fields and a way to emit the necessary function calls to encode/decode data.
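As a rough illustration of the "load a module, call getattr()" step done only once, the sketch below caches the resolved function per (module name, function name) pair using `functools.lru_cache`. The helper name `resolve` and the module `fake_monster_generated` are hypothetical; the module is created in memory and registered in `sys.modules` purely so the example runs without any generated Flatbuffer code.

```python
import importlib
import sys
import types
from functools import lru_cache


@lru_cache(maxsize=None)
def resolve(module_name: str, func_name: str):
    """Import the module and look up the function; repeat calls hit the cache."""
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Hypothetical stand-in for a generated module, registered so that
# importlib.import_module() can find it by name.
fake = types.ModuleType("fake_monster_generated")
fake.AddHp = lambda builder, hp: builder.append(("hp", hp))
sys.modules["fake_monster_generated"] = fake

add_hp = resolve("fake_monster_generated", "AddHp")  # import + getattr happen here
log = []
add_hp(log, 80)
same = resolve("fake_monster_generated", "AddHp")  # served from the cache
```

The resolved callables could then be handed to whatever emits the per-field calls, so the import and attribute lookup costs are paid per schema rather than per serialized object.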
The pre/post hooks would take care of the "framing" of the Flatbuffer table (e.g. calling Start() and End(), as well as creating a buffer at some point).
Describe alternatives you've considered
Currently I use getattr() calls each time a DataClass is serialized. I would like to generate the code only once, based on the DataClass, and thus hopefully get a significant performance boost.
Additional context
If it's feasible, I don't mind implementing the Mixin myself.