Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Connector APIs] Connector update last sync info, status, error #2641

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jedrazb
Copy link
Member

@jedrazb jedrazb commented Jun 17, 2024

Closes https://github.com/elastic/search-team/issues/7792

Use update error api, update status API and update last sync stats api to manage connector lifecycle during syncs.

This feature is behind a feature flag that is disabled by default.

Validation

  • added unit tests
  • tested e2e on all sync types

Pre-Review Checklist

  • this PR does NOT contain credentials of any kind, such as API keys or username/passwords (double check config.yml.example)
  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • this PR has a thorough description
  • Covered the changes with automated tests
  • Tested the changes locally
  • Added a label for each target release version (example: v7.13.2, v7.14.0, v8.0.0)

@jedrazb jedrazb marked this pull request as ready for review June 25, 2024 07:13
@jedrazb jedrazb requested a review from a team June 25, 2024 07:13
Comment on lines 747 to 755
await self.index.api.connector_update_last_sync_info(
connector_id=self.id, last_sync_info=last_sync_information
)
await self.index.api.connector_update_status(
connector_id=self.id, status=Status.CONNECTED.value
)
await self.index.api.connector_update_error(
connector_id=self.id, error=None
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a little concerning - we do 3 calls to do single thing? Should we merge them into one call?

Copy link
Member Author

@jedrazb jedrazb Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could unify error and status endpoint in a single call (requires small ES adjustment). Set status as a function of error being null or non-null.

However, I think we should maintain the _last_sync as a separate call. Integrating them would require expanding the last_sync_info endpoint with even more values in the request body (e.g. it would need to take error) - doable but then we are converging to _update like functionality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, uniting status update + error update in one endpoint and leaving update_last_sync_info endpoint separately is a good way forward

Copy link
Member

@artem-shelkovnikov artem-shelkovnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some questions - currently some of the calls are updating several fields at once in a way that does not make connector enter invalid state.

With new api changes I see that it's possible that only partial updates are applied to the records in connectors index (e.g. error is populated, but status is not changed) if something goes wrong - CTRL+C, network blip, Elasticsearch crashing, etc

Comment on lines 771 to 776
await self.index.api.connector_update_error(
connector_id=self.id, error=error
)
await self.index.api.connector_update_status(
connector_id=self.id, status=Status.ERROR.value
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - this is an atomic action "mark as error" that updates status and writes error, should it be single action?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we could unify this in a following way:

  • we just call _error endpoint, the logic in ES could set the status depending if error is null or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be perfect IMO!

Comment on lines 832 to 840
await self.index.api.connector_update_status(
connector_id=self.id, status=connector_status.value
)
await self.index.api.connector_update_error(
connector_id=self.id, error=job_error
)
await self.index.api.connector_update_last_sync_info(
connector_id=self.id, last_sync_info=last_sync_information
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And same thing here - is there any chance we unite these three into one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See answer above about other 3 calls

@jedrazb
Copy link
Member Author

jedrazb commented Jun 25, 2024

With new api changes I see that it's possible that only partial updates are applied to the records in connectors index (e.g. error is populated, but status is not changed) if something goes wrong - CTRL+C, network blip, Elasticsearch crashing, etc

@artem-shelkovnikov Thank you for your review. While I agree that making several calls in a non-atomic block is not ideal, if we pursue the path of creating a single call that updates the connector document in the given scenario, we could end up with numerous endpoints or go back to what the OG _update endpoint was doing (single endpoint many fields in payload).

I propose we do a slight improvement to error endpoint: if we set error to null it means that connector is in connected state (beginning of new sync), for non-null errors it means we ended up in error state. This can eliminate one call from that block.

@artem-shelkovnikov
Copy link
Member

@jedrazb my main concern is not performance, but changing the system into invalid state with API - even if it happens for 1-2 seconds.

Optimisation of calls on the other hand I think is not as important - as you mentioned, we will end up with just _update endpoint in the end.

@jedrazb
Copy link
Member Author

jedrazb commented Jun 26, 2024

my main concern is not performance, but changing the system into invalid state with API - even if it happens for 1-2 seconds.

Tbh as long as we can update current error string and status we should be fine (so let's optimise this into a single request, update error with status side-effect).

The last_sync_* stuff could in theory fail as this is purely for informational purposes in Kibana, see:
Our operational logic doesn't depend on this in elastic/connectors repo or Kibana (reference, usage)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants