[Connector APIs] Connector update last sync info, status, error #2641
connectors/protocol/connectors.py
Outdated
await self.index.api.connector_update_last_sync_info(
    connector_id=self.id, last_sync_info=last_sync_information
)
await self.index.api.connector_update_status(
    connector_id=self.id, status=Status.CONNECTED.value
)
await self.index.api.connector_update_error(
    connector_id=self.id, error=None
)
That's a little concerning - we make 3 calls to do a single thing? Should we merge them into one call?
We could unify the error and status endpoints into a single call (requires a small ES adjustment): set status as a function of error being null or non-null.
However, I think we should keep _last_sync as a separate call. Integrating them would require expanding the last_sync_info endpoint with even more values in the request body (e.g. it would need to take error). That is doable, but then we are converging towards _update-like functionality.
Agreed, uniting status update + error update in one endpoint and leaving the update_last_sync_info endpoint separate is a good way forward.
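A minimal sketch of the "status as a function of error" idea agreed on above. The `Status` enum here is a local stand-in with only the two members that appear in the diff, and its string values, `status_for_error` and `build_error_update` are hypothetical names used for illustration; the real logic would live on the Elasticsearch side.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    # Only the two members visible in the diff; the string values are assumed.
    CONNECTED = "connected"
    ERROR = "error"


def status_for_error(error: Optional[str]) -> Status:
    """Derive the connector status from the error field alone."""
    return Status.ERROR if error is not None else Status.CONNECTED


def build_error_update(error: Optional[str]) -> dict:
    """Build one partial document that carries error and status together."""
    return {"error": error, "status": status_for_error(error).value}
```

With this shape, a caller only supplies `error`, and the two fields cannot disagree because they are written in the same update.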
Left some questions - currently some of the calls update several fields at once, in a way that cannot leave the connector in an invalid state.
With the new API changes, I see that it's possible for only partial updates to be applied to the records in the connectors index (e.g. error is populated but status is not changed) if something goes wrong: CTRL+C, a network blip, Elasticsearch crashing, etc.
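The partial-update concern can be illustrated with a small simulation. This is only a sketch: `FlakyStore` is a hypothetical in-memory stand-in for the connectors index (synchronous for brevity, whereas the real calls are awaited), and the consistency rule in `is_consistent` is an assumption about what "invalid state" means here.

```python
class FlakyStore:
    """In-memory stand-in for a connector document; can fail on the Nth write."""

    def __init__(self, fail_on_write=None):
        self.doc = {"status": "connected", "error": None}
        self.writes = 0
        self.fail_on_write = fail_on_write

    def update(self, fields):
        self.writes += 1
        if self.writes == self.fail_on_write:
            # Simulates CTRL+C, a network blip, or Elasticsearch crashing.
            raise ConnectionError("simulated network blip")
        self.doc.update(fields)


def mark_error_two_calls(store, error):
    # Two separate writes: a failure between them leaves a half-updated doc.
    store.update({"error": error})
    store.update({"status": "error"})


def mark_error_one_call(store, error):
    # One write: either both fields change or neither does.
    store.update({"error": error, "status": "error"})


def is_consistent(doc):
    # Assumed invariant: an error is present exactly when status is "error".
    return (doc["error"] is not None) == (doc["status"] == "error")
```

If the second write of `mark_error_two_calls` fails, the document keeps the new error alongside the old status. With `mark_error_one_call`, a failed write leaves the document untouched.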
connectors/protocol/connectors.py
Outdated
await self.index.api.connector_update_error(
    connector_id=self.id, error=error
)
await self.index.api.connector_update_status(
    connector_id=self.id, status=Status.ERROR.value
)
Same here - "mark as error" is an atomic action that updates status and writes error; should it be a single action?
Agreed that we could unify this in the following way:
- we just call the _error endpoint; the logic in ES could set the status depending on whether error is null or not
That would be perfect IMO!
connectors/protocol/connectors.py
Outdated
await self.index.api.connector_update_status(
    connector_id=self.id, status=connector_status.value
)
await self.index.api.connector_update_error(
    connector_id=self.id, error=job_error
)
await self.index.api.connector_update_last_sync_info(
    connector_id=self.id, last_sync_info=last_sync_information
)
And same thing here - is there any chance we unite these three into one?
See answer above about other 3 calls
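For illustration, here is roughly how the three-call block could shrink to two calls if the error endpoint set status as a side effect. `FakeConnectorApi` is a hypothetical stand-in for `self.index.api` that only records calls, `sync_done` is an illustrative name, and the `last_sync_status` key in the usage line is a made-up example payload. The method names and keyword arguments mirror the ones in the diff, but the server-side status side effect is an assumption taken from the proposal in this thread.

```python
import asyncio


class FakeConnectorApi:
    """Hypothetical stand-in for self.index.api; records calls only."""

    def __init__(self):
        self.calls = []

    async def connector_update_error(self, connector_id, error):
        # Assumed post-change behaviour: the server derives status from error.
        self.calls.append(("error", connector_id, error))

    async def connector_update_last_sync_info(self, connector_id, last_sync_info):
        self.calls.append(("last_sync_info", connector_id, last_sync_info))


async def sync_done(api, connector_id, job_error, last_sync_information):
    # One call covers error + status; last sync info stays a separate call.
    await api.connector_update_error(connector_id=connector_id, error=job_error)
    await api.connector_update_last_sync_info(
        connector_id=connector_id, last_sync_info=last_sync_information
    )


api = FakeConnectorApi()
asyncio.run(sync_done(api, "my-connector", None, {"last_sync_status": "completed"}))
```

The error/status pair becomes a single write, while the last sync info can still fail independently of it, which is the trade-off accepted earlier in the thread.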
@artem-shelkovnikov Thank you for your review. While I agree that making several calls in a non-atomic block is not ideal, if we pursue the path of creating a single call that updates the connector document in the given scenario, we could end up with numerous endpoints or go back to what the OG I propose we do a slight improvement to error endpoint: if we set |
@jedrazb my main concern is not performance, but changing the system into invalid state with API - even if it happens for 1-2 seconds. Optimisation of calls on the other hand I think is not as important - as you mentioned, we will end up with just |
Tbh as long as we can update current error string and status we should be fine (so let's optimise this into a single request, update error with status side-effect). The |
Closes https://github.com/elastic/search-team/issues/7792
Use the update error API, update status API and update last sync stats API to manage the connector lifecycle during syncs.
This feature is behind a feature flag that is disabled by default.