An old version with only the bare minimum is available on SO, and a less minimal version on gist. In a perf test of setitem and getitem with indirect access, they were as fast as the original dict! We should use these versions as references and reduce the complexity of the newer fdict. /EDIT: no, in fact this is because the test uses di[str(j)] = {}, which in older releases created a plain dict instead of doing nothing. If you add that check, fdict is used for subdicts, and performance is about the same as it is currently. The 10x slowdown is thus probably due to fdict instantiation + string manipulation.
In practice, fdict setitem and getitem are 10x slower than dict when using indirect access (direct access is as fast as dict). After profiling, it seems most of the time is spent in string operations: join and [:-1] == delimiter. Maybe we could internally replace strings with a bitarray representation? But wouldn't that make things even slower?
Fastview mode is very slow on setitem because of metadata building, which is O(m*l), where m is the number of parents per leaf and l is the number of leaves added, thus quadratic time. It should be possible to do it in linear time by walking each parent node only once: we should build the list of parent nodes top-down, instead of bottom-up as is currently done.
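A minimal sketch of building the parent metadata so that each parent node is created only once. Note that it swaps in a different technique from the top-down build suggested above: it walks bottom-up but stops at the first ancestor that is already registered, which gives the same "visit each parent once" behaviour. The function name `build_parent_index`, the `/` delimiter, and the "parent path -> set of direct children" layout are assumptions for illustration, not fdict's actual internals.

```python
def build_parent_index(leaf_keys, delimiter="/"):
    """Map each parent path to the set of its direct children.

    Hypothetical helper: once an ancestor is already in the index,
    all of its own ancestors are too, so the walk can stop early.
    """
    index = {}
    for key in leaf_keys:
        child = key
        while True:
            cut = child.rfind(delimiter)
            if cut == -1:
                break  # reached a top-level node, nothing above it
            parent = child[:cut]
            children = index.get(parent)
            if children is not None:
                # Parent already known: register the child and stop,
                # the rest of the chain was built by an earlier key.
                children.add(child)
                break
            # New parent: create it, then continue one level up.
            index[parent] = {child}
            child = parent
    return index


index = build_parent_index(["a/b/c", "a/b/d", "a/e"])
```

The total work is then proportional to the number of leaves plus the number of distinct parent nodes, ignoring the cost of the string slicing itself.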
getitem currently returns another fdict() instance with a reference to the same internal dict and exactly the same parameters EXCEPT one: rootpath, a simple string. Maybe we could find another pythonic way to reuse the parent fdict but just define another rootpath for the child? In other words, an exact copy except for one field.
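One possible way to get "an exact copy except for one field" is a shallow `copy.copy()` followed by overriding `rootpath`. The sketch below uses a hypothetical `FlatDict` stand-in, not the real fdict class, and assumes the shared storage lives in a single attribute (here called `store`).

```python
import copy


class FlatDict:
    """Hypothetical stand-in for fdict: one flat store, keys prefixed by rootpath."""

    def __init__(self, store=None, rootpath="", delimiter="/"):
        self.store = {} if store is None else store
        self.rootpath = rootpath
        self.delimiter = delimiter

    def child(self, key):
        # Shallow copy: __init__ is not called and every attribute,
        # including the underlying store, is shared by reference.
        view = copy.copy(self)
        view.rootpath = self.rootpath + key + self.delimiter
        return view

    def __setitem__(self, key, value):
        self.store[self.rootpath + key] = value

    def __getitem__(self, key):
        return self.store[self.rootpath + key]


root = FlatDict()
sub = root.child("a")
sub["b"] = 1
```

Because the copy skips `__init__`, any new parameter added to the class later is carried over automatically, without having to be passed again on each getitem.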
Move fastview mode fdict to its own class. Fastview will maybe be a bit slower but standard fdict will be faster!
Not a speed optimization, but the getitem tests in benchmarks.py also include the time spent in setitem. We should fix that so they measure only getitem.
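A sketch of how the getitem benchmark could exclude setitem time: fill the container before timing starts, then time only the reads. The helper name `bench_getitem` and its parameters are hypothetical and do not reflect the current contents of benchmarks.py; `dict` is used as the factory here, and an fdict constructor could be passed instead.

```python
import timeit


def bench_getitem(factory, n=1000, repeat=5, number=10):
    """Return the best time (in seconds) for `number` full read passes."""
    # Setup phase, not timed: populate the container.
    d = factory()
    keys = [str(j) for j in range(n)]
    for j, k in enumerate(keys):
        d[k] = j

    def read_all():
        # Timed phase: getitem only.
        for k in keys:
            d[k]

    return min(timeit.repeat(read_all, number=number, repeat=repeat))


best = bench_getitem(dict, n=100, repeat=2, number=5)
```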
Speed optimizations todo list: