A region went through a conf change that added a new replica, and the new replica became the leader
The client's region cache still holds the old region info from before the conf change
Then when the client tries to access the region's leader:
The client tries to access one of the region's old replicas
That TiKV sees both a mismatched epoch and a mismatched leader. Due to its implementation, TiKV returns the NotLeader error in preference to EpochNotMatch, carrying the new leader's peer info
The client tries to switch to the new leader, but because its region info predates the conf change, the new leader is missing from its peer list. It then invalidates the region.
This triggers a reload of the region info from PD. But because the information held by the current PD leader is also out of date, the client cannot get valid region information for sending the request until PD updates it. As a result, access fails for about two minutes.
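The leader-switch step in the sequence above can be sketched roughly as follows. This is a minimal illustration, not client-go's actual code: the types `Peer`, `CachedRegion`, and `NotLeader` and the function `switchLeader` are hypothetical, simplified stand-ins.

```go
package main

// Peer and CachedRegion are hypothetical, simplified stand-ins for a
// region cache entry; they are not client-go's real types.
type Peer struct {
	ID      uint64
	StoreID uint64
}

type CachedRegion struct {
	ID      uint64
	Peers   []Peer
	Leader  Peer
	Invalid bool
}

// NotLeader models the error payload as described in this issue: it carries
// only the new leader's peer info, not the full region.
type NotLeader struct {
	RegionID uint64
	Leader   Peer
}

// switchLeader applies the leader hint from a NotLeader error. If the hinted
// leader is not in the cached peer list (for example because the cache
// predates a conf change), it marks the cached region invalid and returns false.
func switchLeader(r *CachedRegion, e NotLeader) bool {
	for _, p := range r.Peers {
		if p.ID == e.Leader.ID {
			r.Leader = p
			return true
		}
	}
	r.Invalid = true
	return false
}
```

With a cached peer list of peers 1, 2, and 3 and a hint naming peer 4 as leader, `switchLeader` returns false and the region is invalidated, which is the point where the client falls back to PD.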
A simple solution to this problem would be to carry the full region information when reporting the NotLeader error, and to use it to update the region cache if the returned region info has a larger epoch than the one in the region cache.
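A rough sketch of that idea is shown below. Everything in it is an assumption made for illustration: the `Region` field on the error is the proposed extension rather than an existing one, the type and function names are hypothetical, and "larger epoch" is interpreted here as no epoch component being smaller and at least one being larger.

```go
package main

// All types here are hypothetical, simplified stand-ins, not real
// client-go or kvproto definitions.
type RegionEpoch struct {
	ConfVer uint64
	Version uint64
}

type Peer struct {
	ID      uint64
	StoreID uint64
}

type Region struct {
	ID    uint64
	Epoch RegionEpoch
	Peers []Peer
}

// NotLeaderWithRegion models the proposed error payload: the leader hint
// plus the full region info known to the responding TiKV.
type NotLeaderWithRegion struct {
	Leader Peer
	Region *Region
}

type RegionCache struct {
	Regions map[uint64]*Region
	Leaders map[uint64]Peer
}

// newer reports whether epoch a is strictly newer than b (assumed rule:
// neither component is smaller and at least one is larger).
func newer(a, b RegionEpoch) bool {
	return a.ConfVer >= b.ConfVer && a.Version >= b.Version &&
		(a.ConfVer > b.ConfVer || a.Version > b.Version)
}

// OnNotLeader refreshes the cached region from the error when it carries a
// newer epoch, then applies the leader hint. It returns false if the hinted
// leader is still unknown, in which case the caller would fall back to PD.
func (c *RegionCache) OnNotLeader(e NotLeaderWithRegion) bool {
	if e.Region == nil {
		return false
	}
	cached, ok := c.Regions[e.Region.ID]
	if !ok || newer(e.Region.Epoch, cached.Epoch) {
		c.Regions[e.Region.ID] = e.Region
		cached = e.Region
	}
	for _, p := range cached.Peers {
		if p.ID == e.Leader.ID {
			c.Leaders[cached.ID] = p
			return true
		}
	}
	return false
}
```

In the scenario from this issue, the error would carry a region with a larger conf version that includes the new replica, so the cache is refreshed and the leader switch succeeds without a round trip to PD.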
However, tikv/client-go#1398 considers the problem in more depth. We may need to think further about how to improve the way the client updates its region info.
Ref: tikv/client-go#1398
We recently ran into the issue described above.