Feature/add pg source #159
base: pre-release/v0.3.0
Conversation
session = self._service.get_service_session()
try:
    logs = session.query(Logs).filter(query_filter).order_by(*Logs.__query_order__).all()
I'm a bit worried about query performance here.
Also, do you need ordering since you are fetching all the rows? Or could the ordering be done in memory?
I will add the partition key "timestamp" to the query to improve its performance.
The ORDER BY only sorts the result set; I use it for uniform processing.
If the ordering has a significant negative impact on execution time, I will sort in memory instead.
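For reference, the in-memory alternative discussed above could look like the sketch below. The `LogRow` stand-in and its sort key are assumptions for illustration; the real model and the columns in `Logs.__query_order__` come from the PR.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a Logs row; the real SQLAlchemy model
# lives in the PR and may order by different columns.
@dataclass
class LogRow:
    block_number: int
    log_index: int

def sort_logs_in_memory(rows):
    # Equivalent of ORDER BY block_number, log_index done client-side:
    # fetch all rows without ORDER BY, then sort in the application,
    # trading database CPU for application memory.
    return sorted(rows, key=lambda r: (r.block_number, r.log_index))

rows = [LogRow(2, 0), LogRow(1, 1), LogRow(1, 0)]
ordered = sort_logs_in_memory(rows)
```

This only helps if the database's sort is the bottleneck; when a suitable index already covers the ORDER BY columns, the server-side sort is usually free.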
session = self._service.get_service_session()
try:
    transactions = (
        session.query(Transactions).filter(query_filter).order_by(*Transactions.__query_order__).all()
    )
Same question here. We need to test the performance.
Check whether there are enough indexes.
The indexes have been checked.
I will add the partition key "timestamp" to the query to improve its performance.
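The partition-key idea above can be illustrated with a minimal self-contained query (table and column names are illustrative, not the PR's schema): constraining the query to a timestamp range lets the database prune partitions or use a timestamp index instead of scanning every row.

```python
import sqlite3

# Toy table standing in for the partitioned logs table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (timestamp INTEGER, block_number INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [(100, 1), (200, 2), (300, 3)],
)

start_ts, end_ts = 150, 250
rows = conn.execute(
    "SELECT block_number FROM logs "
    "WHERE timestamp BETWEEN ? AND ? "  # partition-key predicate
    "ORDER BY block_number",
    (start_ts, end_ts),
).fetchall()
```

In PostgreSQL, the same `WHERE timestamp BETWEEN …` predicate is what allows the planner to skip partitions entirely.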
if self._service is None:
    raise FastShutdownError("-pg or --postgres-url is required to run PGSourceJob")
self.build_dependency = {}
What does this build_dependency do?
This is leftover code from development and will be removed in the next commit.
cli/stream.py
Outdated
output_types = list(set(parse_output_types))
if source_types is None and source_path.startswith("postgresql://"):
    source_types = "block,transaction,log"
if source_types:
Should we add this extra step, or should we require users to list all the source types explicitly?
Instead of this step, I set a default value for --source-types.
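A minimal sketch of the default-value approach, using stdlib argparse for self-containment (the actual CLI may use a different framework, and the default string here is taken from the inference branch above):

```python
import argparse

# Giving --source-types a default means users no longer have to list
# every entity type, and the startswith("postgresql://") inference
# step becomes unnecessary.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--source-types",
    default="block,transaction,log",
    help="Comma-separated entity types to read from the source.",
)

args = parser.parse_args([])  # no flag given -> default applies
source_types = args.source_types.split(",")
```

Users who want a subset still pass `--source-types block,log` explicitly, overriding the default.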
@@ -375,6 +400,9 @@ def stream(
    batch_web3_provider=ThreadLocalProxy(lambda: get_provider_from_uri(provider_uri, batch=False)),
    job_scheduler=job_scheduler,
    sync_recorder=create_recorder(sync_recorder, config),
    limit_reader=create_limit_reader(
What if the user passes some other value in source_path?
In unexpected cases, limit_reader will request the latest block from RPC.
If source_path specifies a CSV source, parameter checking ensures that the rpc_limit_reader does not interfere with the program.
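The fallback behaviour described above could be dispatched on the source_path scheme, as in this sketch. The class and function names are hypothetical stand-ins for the PR's create_limit_reader and readers:

```python
# Hypothetical readers; the real implementations talk to an RPC node
# or to PostgreSQL respectively.
class RpcLimitReader:
    def describe(self):
        return "latest block from rpc"

class PgLimitReader:
    def describe(self):
        return "max block in postgres"

def create_limit_reader(source_path):
    if source_path and source_path.startswith("postgresql://"):
        return PgLimitReader()
    # Any unexpected value (including a csv source) falls back to the
    # RPC reader; for a csv source, parameter checking elsewhere keeps
    # this reader from interfering with the run.
    return RpcLimitReader()

reader = create_limit_reader("file:///data/blocks.csv")
```

The key property is that the fallback is safe by construction: the RPC reader only reports a ceiling, so an unused instance costs nothing.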
No description provided.