-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to cache Roborock maps instead of always reloading #112047
base: dev
Are you sure you want to change the base?
Conversation
Hey there @humbertogontijo, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
"""Take the serialized dict and convert to object.""" | ||
return RoborockImageExtraStoredData( | ||
map_flag=serialized_dict["map_flag"], | ||
cached_image=base64.b64decode(serialized_dict["cached_image"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this will make the restored entity cache huge as it now contains a multiple whole images in base64 (which takes up more space than original image).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So from what it seems for me (on my map at least)
The response from the api has a length of 269612 bytes.
The cached map aka the image has a length of 7909 bytes
And the base64 encoded serialized image has a length of 10548 bytes.
I'm open to a better way of storing the image, I just wanted to make sure that I was able to consistently serialize and serialize the image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better stored on disk outside of the restore entity cache. That cache is written to disk every 15min. In my production the restore state cache is 123KB. That would increase with 25% (I have 3 maps).
If you write it to disk, make sure it's cleaned up when the config entry is removed. Maybe you can make a folder inside .storage
: <config>/.storage/roborock_<config entry ID>/maps
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be fine by me. My only question is how often do I write it to the disk? Should it only be written to the disk on config entry unloading? Or should it get done at the same point as the restore entity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also remembered that the response from the api before we decompress it is a gzip file. So I could also save that to disk and that would save some space. I'm not 100% sure a clean way to do that as it is kind of in the middle of the process within our package, but if we want it to be even smaller, I can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think firing off an async write when the map has changed is fine, and could include some rate limiting or debouncing (e.g. you don't necessarily need to write every time the robot moves).
I don't think its worth it to do anything fancy there in the middle of the library. I've gone down that road when writing lots of objects where nest passes in a storage interface into the client library for saving images/videos nest media source and the interface but i don't think its worth it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to do tests - but I took what you did in Nest and modified it. No injection within the library, but I did think it would be useful to have a central object for handling the storage.
I'll try to get tests done soon, but for now I'd appreciate any feedback on the implementation. Will mark as draft until I get tests done.
yeah, you can use our storage helper. create a file |
Marking as draft until i have a chance to do that |
Finally got around to it. Should be ready for a look over |
return self._hass.config.path(f"{MAP_PATH}/{self._entry_id}/{map_name}") | ||
|
||
def exec_load_maps(self, map_names: list[str]) -> list[bytes | None]: | ||
"""Load map content. Should be called in executor thread.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it would be safer to separate the map loading disk i/o from updating the python object state (_data
). That is, it seems like the save and load can happen from different executor threads.
Perhaps provide async interfaces and let those update the object state and defer to the executor threads for I/O only.
When writing, i think its fine to only update _data
after the map write succeeds rather than writing and handling a rollback case on error.
|
||
import dataclasses | ||
import logging | ||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer pathlib
for all the filename and file/directory manipulation. It will make some of the path concatenation and read/write operations a little less code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't used pathlib before, but I think I did it all correctly.
Although one test is failing now, perhaps i'm patching incorrectly. I will have to look later.
|
||
async def async_load_map_info(self, coord_duid: str) -> dict[int, RoborockMapInfo]: | ||
"""Load a coordinator's map info.""" | ||
map_info: dict[str, dict[str, dict]] | None = await self._store.async_load() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads from disk every time load is called so the use of self._map_info
does not seem to be adding value. Is the idea to treat _map_info
as a cache or should it be dropped entirely? It seems like the intent was to load once at startup, then always read from it and write through directly to disk every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only called once. _map_info is a cache so that we can update the coord that changed and then rewrite it.
The coord that calls the save does not have access to all of the data, so roborock storage holds it instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, my thinking since its not used when reading, it could just also be re-read before writing to simplify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would mean I'm doubling the IO everytime I am writing to the disk - I can do that but it seems a odd. Happy to do it if you think that is best
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed updates were rare, so i was trying to see about removing complexity. If this is really a performance sensitive flow then it could also be improved by not storing all the maps in a single file since updating maps for one coord_duid
requires updating all of them.
If we want to make this work, then there are some either potential bugs here or semantics that need to be enforced in how this is used:
- If you call save before calling load then the existing map data gets overwritten
- Calling remove map info doesn't clear the map info so then calling save will write back old map data
- My assumption is that two different coordinators can call
load
andsave
interleaved and i'm not sure how safe that is.
These work today because of assumptions in exactly how these are used by the callers today, which i think make this a little brittle, or spread logic across two files.
Removing _map_info
sidesteps needing to address these bugs / or add documentation explaining assumptions / or add invariant checking to enforce a specific call pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what makes this fancy is that its trying to do two things (1) provide a cache of the entire disk and (2) provide partial update semantics with read/write of a subset of the entire cached data.
Separating the responsibilities and having the cache at a lower level, reading /writing the entire dataset would make the semantics simpler and easier to reason about. (e.g. cache the entire data on load, overwrite the entire data on save.
But more concretely i think these fixes can make what is here work:
- only ever read from disk once, always prefer to read from the existing data if it was already read
- invalidate map info when removing it from disk
- consider using
async_delay_save
on write to batch writes together from multiple devices
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍 |
|
||
async def async_load_map_info(self, coord_duid: str) -> dict[int, RoborockMapInfo]: | ||
"""Load a coordinator's map info.""" | ||
map_info: dict[str, dict[str, dict]] | None = await self._store.async_load() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, my thinking since its not used when reading, it could just also be re-read before writing to simplify.
) | ||
# To prevent weirdness - if there is a map without a name | ||
# - set its name to its flag | ||
if roborock_map.mapFlag not in self.maps: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is what is happening here is the we set self.maps =
to whatever is currently in the local storage, then we ask for the list of maps from the API. The content of self.maps
is never updated with a new list of names or rooms, which is not necessarily what I would expect if the API returned newer data. I may be misunderstanding though: Is this ok for another reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing my thoughts as I reread my code - so stick with me.
We load the saved map info from the storage.
We then get the list of maps from the api.
If we haven't seen this map before, we should add it to self._maps.
The api doesn't have room information, that information can only be gotten when we have that current map selected. so we can't get it here. instead, room updates would be done in get_rooms().
I think the point about names is fair though, so I will update the name here if the name has changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although on second thought - I don't think entities can handle dynamic changing of attribute name, so i'm not sure if that makes sense to handle. I will log a warning for now and you can give input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK stepping back: What is the intent here of refreshing this regularly? What are the cases in scope and not in scope.
As an alternative: Only query on startup if we don't want to handle all the corner cases. I'm having trouble understanding why some are handled and some are not and so not handling any of them after startup seems appealing to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is purpose is the local disk storage of metadata / map info? The API is always consulted here anyway first and I don't see it read on its own anywhere.
flag=roborock_map.mapFlag, name=map_name, rooms={} | ||
) | ||
update = True | ||
if update: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is expected behavior if the API no longer returns a map? I might expect it to be removed from the local disk cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe? It probably should be - and the entity should be removed as well. Is the proper way to remove the entity just using the entity registry?
@@ -64,25 +65,33 @@ def __init__( | |||
map_flag: int, | |||
starting_map: bytes, | |||
map_name: str, | |||
roborock_storage: RoborockStorage, | |||
create_map: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now starting map can either be map bytes or image bytes, and it seems like it would there should be a way to pick one to keep the abstraction here simpler. Can RoborockMap
just deal with one or the other? Setting a more cirps boundary between the storage class, the coordinator, and the entity will avoid all parts of the system needing to know everything about how the caching works. (There are also multiple levels of caching happening here now so you can consider how to simplify that in the future as well, putting all the cache in a single layer)
Co-authored-by: Allen Porter <[email protected]>
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
Breaking change
Proposed change
After talking with Paulus, he has multiple maps and the time it takes to startup Roborock is costly (about 40s) because of the nature of how we have to load the images. Now we only have to do that once because they are cached.
lightly relates to #97311 and conversations on caching there.
Type of change
Additional information
Checklist
ruff format homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
..coveragerc
.To help with the load of incoming pull requests: