v1.4.0
Client
- Implement a `use` feature, enabling a module to reuse an existing module by overriding its inputs or initial block. (The overriding inputs should have the same output type as the overridden module's inputs.) See a usage of this new feature in the substreams-db-graph-converter repository.
- Fix panic when using the '--header (-H)' flag on the `gui` command.
- When packing substreams, pick up docs from the README.md or README in the same directory as the manifest when the top-level package.doc is empty.
- Added "Total read bytes" summary at the end of the 'substreams run' command.
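The README fallback used when packing can be pictured with a short sketch. This is illustrative Python, not the actual Substreams implementation: the function name `resolve_package_doc`, its parameters, and the choice to treat a whitespace-only `package.doc` as empty are all assumptions made for the example. Only the lookup order (README.md, then README, next to the manifest) comes from the note above.

```python
from pathlib import Path
from typing import Optional

# Candidate file names, in the order the release note lists them.
README_CANDIDATES = ("README.md", "README")


def resolve_package_doc(manifest_path: str, package_doc: str) -> Optional[str]:
    """Return the doc text to pack (hypothetical helper, not the real API).

    An explicit package.doc always wins. Otherwise the first README found in
    the manifest's directory is used. Returns None when nothing is available.
    """
    # Assumption: a whitespace-only doc counts as empty.
    if package_doc.strip():
        return package_doc

    manifest_dir = Path(manifest_path).resolve().parent
    for name in README_CANDIDATES:
        candidate = manifest_dir / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    return None
```

Under these assumptions, a manifest with an empty doc field and a sibling README.md would be packed with the README.md contents, while a manifest that sets its own doc is left unchanged.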
Server performance in "production-mode"
Some redundant reprocessing has been removed, and caches are now used more effectively to avoid reading the same blocks multiple times. Concurrent requests may benefit from each other's work to a certain extent (up to 75%).
- All module outputs are now cached. Previously, only the last module was cached, along with the "store snapshots", to allow parallel processing. This will increase disk usage, since there is no automatic removal of old module caches.
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time).
- Tier2 will skip processing completely if it's processing a stage that is not the last, but all the stores and outputs have been processed and cached.
- The "partial" store outputs no longer contain the trace ID in the filename, allowing them to be reused. If many requests point to the same modules being squashed, the squasher will detect if another Tier1 has squashed its file and reload the store from the produced full KV.
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%!
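The two Tier2 skip rules above can be summarized as a small decision function. The sketch below is a simplified model in Python, not the server's code: `StageJob`, its fields, and `can_skip` are hypothetical names, and "cached" is reduced to a set of module names whose output for the segment already exists.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class StageJob:
    """Hypothetical description of one Tier2 job (not a real Substreams type)."""
    is_last_stage: bool
    output_module: str
    output_module_is_mapper: bool
    stage_modules: FrozenSet[str]  # stores and outputs this stage would produce


def can_skip(job: StageJob, cached: FrozenSet[str]) -> bool:
    """Return True when the job has nothing left to compute for its segment."""
    if job.is_last_stage:
        # Last stage: skip only if the output module is a mapper whose
        # output is already cached (e.g. written by a concurrent request).
        return job.output_module_is_mapper and job.output_module in cached
    # Earlier stage: skip only if every store and output is already cached.
    return job.stage_modules <= cached
```

In this model, a last-stage job whose output module is a store is never skipped, because the notes only describe the mapper case. That is a conservative reading and may not match the server's behavior exactly.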
Operator concerns
- Tier2 service now supports a maximum concurrent requests limit. Default set to 0 (unlimited).
- Readiness metric for Substreams tier1 app is now named `substreams_tier1` (was mistakenly called `firehose` before).
- Added back readiness metric for Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently making against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total number of Substreams worker requests a tier1 app made against tier2 nodes.
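The two new tier1 worker metrics behave differently over time: going by the descriptions above, the active-requests value rises and falls like a gauge, while the request counter only grows. The Python sketch below illustrates that difference with a hypothetical `WorkerRequestMetrics` class. It does not use a real metrics library and is not how the tier1 app records these values.

```python
import threading
from contextlib import contextmanager


class WorkerRequestMetrics:
    """Hypothetical in-memory stand-in for the two tier1 worker metrics."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.active_worker_requests = 0  # gauge-like: goes up and down
        self.worker_request_counter = 0  # counter-like: only increases

    @contextmanager
    def track(self):
        """Wrap one worker request sent to a tier2 node."""
        with self._lock:
            self.active_worker_requests += 1
            self.worker_request_counter += 1
        try:
            yield
        finally:
            # The request finished (successfully or not): only the gauge drops.
            with self._lock:
                self.active_worker_requests -= 1
```

When reading dashboards, this means the active value shows current load on tier2 nodes, while the rate of change of the counter shows how many worker requests are being issued over time.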