Nightly OOD preview version has an incomplete update #236

Closed
lizhihongTest opened this issue Apr 21, 2023 · 6 comments
Labels
bug Something isn't working OOD-daemon The OOD-daemon basic service

Comments

@lizhihongTest
Collaborator

Describe the bug
The version update on my OOD is incomplete: some services are at version 744, others are at 750, and they have not been updated to the latest version, 753.

Details

root@debian:/home/bucky# cat /cyfs/etc/ood-daemon/device-config.toml 
[[service]]
id = "9tGpLNnDwJ1nReZqJgWev5eoe23ygViGDC4idnCK1Dy5"
name = "app-manager"
fid = "7jMmeXZTUL3CHp2JmRmkD5mYBTF7nbwmZbM66ynYWGMp/x86_64-unknown-linux-gnu.zip"
version = "1.1.0.744"
enable = true
target_state = "Run"

[[service]]
id = "9tGpLNnabHoTxFbodTHGPZoZrS9yeEZVu83ZVeXL9uVr"
name = "chunk-manager"
fid = "7jMmeXZeJWzYrub42GyKSZafjstZYgksTakMuha2YM2a/x86_64-unknown-linux-gnu.zip"
version = "1.1.0.750-preview"
enable = true
target_state = "Run"

[[service]]
id = "9tGpLNnDpa8deXEk2NaWGccEu4yFQ2DrTZJPLYLT7gj4"
name = "file-manager"
fid = "7jMmeXZYqNGKFSRbdAbB7BQSrD7cNbU1CGQ9n89x7oc2/x86_64-unknown-linux-gnu.zip"
version = "1.1.0.750-preview"
enable = true
target_state = "Run"

[[service]]
id = "9tGpLNnQnReSYJhrgrLMjz2bFoRDVKP9Dp8Crqy1bjzY"
name = "gateway"
fid = "7jMmeXZdfYRLBsULujdn7wV9VmvN24rydqFY7TZM9Ab3/x86_64-unknown-linux-gnu.zip"
version = "1.1.0.750-preview"
enable = true
target_state = "Run"

[[service]]
id = "9tGpLNnTdsycFPRcpBNgK1qncX6Mh8chRLK28mhNb6fU"
name = "ood-daemon"
fid = "7jMmeXZck8eSjPSkX2FB5LKWvNqMYco3ti4m4mxihssb/x86_64-unknown-linux-gnu.zip"
version = "1.1.0.744"
enable = true
target_state = "Run"

To Reproduce
The problem occurs only occasionally.

Expected behavior

  • Under the current mechanism, all services on an OOD should have the same version number.
  • The OOD should have been updated to the latest version, 753, but it was not.

System information
OS : Debian
ood-daemon_699_rCURRENT.log

@lizhihongTest lizhihongTest added the bug Something isn't working label Apr 21, 2023
@lurenpluto lurenpluto added the OOD-daemon The OOD-daemon basic service label Apr 21, 2023
@lurenpluto
Member

This looks like ood-daemon updating services to inconsistent versions.

ood-daemon follows this process when updating:

1. Read the service version list of the current OOD

This step obtains a version list containing the expected version of each service. The list is usually read from the meta chain, but it can also be configured entirely locally.

When pulled from the chain, it is currently the AppList object:

pub type AppListId = NamedObjectId<AppListType>;
pub type AppList = NamedObjectBase<AppListType>;

2. Select the version of each service

This step pulls the list of available versions of each service from the meta chain. Each service is itself a DecApp object, and its currently available versions are stored in the mutable body (mut body). ood-daemon then selects the appropriate version according to the version information read in step 1.

Every time a new version of a service is released, it is added to this DecApp object, which is then republished to the chain.

pub type DecAppId = NamedObjectId<DecAppType>;
pub type DecApp = NamedObjectBase<DecAppType>;
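
The selection in step 2 can be sketched roughly as follows. This is only an illustration, not the ood-daemon implementation: `parse_version` and `select_version` are hypothetical names, and the rule syntax (an exact version string, or `*` meaning "newest available") is a simplifying assumption rather than full semver matching.

```rust
/// Hypothetical helper: parse a version string such as "1.1.0.753-preview"
/// into its numeric parts. Any suffix after '-' is ignored.
fn parse_version(s: &str) -> Option<Vec<u64>> {
    let numeric = s.split('-').next()?;
    numeric.split('.').map(|p| p.parse::<u64>().ok()).collect()
}

/// Pick one version from `available` (standing in for the list stored in a
/// service's DecApp body). `rule` is either "*" or an exact version string.
fn select_version<'a>(available: &[&'a str], rule: &str) -> Option<&'a str> {
    if rule == "*" {
        // Newest version wins; unparsable entries are skipped.
        available
            .iter()
            .copied()
            .filter(|v| parse_version(v).is_some())
            .max_by_key(|v| parse_version(v))
    } else {
        // An exact rule only matches an identical published version.
        available.iter().copied().find(|v| *v == rule)
    }
}
```

With the versions from this report, `select_version(&["1.1.0.744", "1.1.0.750-preview", "1.1.0.753-preview"], "*")` picks `1.1.0.753-preview`, while an exact rule naming an unpublished version yields `None`.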

Caching problem in the update mechanism

Each time ood-daemon checks for updates, it first gets the version of the service list. If that differs from the local one, it then gets the DecApp object of each service in turn. To avoid frequent requests to the meta chain, these lookups are cached, and the current cache time is 4 hours. The related code is as follows:

pub fn new() -> Self {
    // Meta chain client with a 2-minute request timeout
    let meta_client = MetaClient::new_target(MetaMinerTarget::default())
        .with_timeout(std::time::Duration::from_secs(60 * 2));
    Self {
        meta_client,
        cache: Mutex::new(None),
        // Service objects are cached for 4 hours
        service_objects: MetaClientHelperWithObjectCache::new(
            std::time::Duration::from_secs(3600 * 4),
            16,
        ),
        // Service dir objects are cached for 7 days
        service_dir_objects: MetaClientHelperWithObjectCache::new(
            std::time::Duration::from_secs(3600 * 24 * 7),
            16,
        ),
    }
}

So from ood-daemon's point of view, the list of available versions of a service is fixed for a period of time. Even if the DecApp object of the service is updated on the chain during this period, ood-daemon will not see the update immediately.
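
The effect of such a cache can be illustrated with a minimal sketch. `TtlCache` and `get_or_fetch` are hypothetical names, not the API of `MetaClientHelperWithObjectCache`; the `fetch` closure stands in for a request to the meta chain, and the current time is passed in explicitly so the behavior is easy to follow.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory cache with a fixed time-to-live per entry.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>, // key -> (fetched_at, value)
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return the cached value if it is still fresh at `now`; otherwise call
    /// `fetch` (standing in for a meta chain request) and cache the result.
    fn get_or_fetch(
        &mut self,
        key: &str,
        now: Instant,
        fetch: impl FnOnce() -> String,
    ) -> String {
        if let Some((fetched_at, value)) = self.entries.get(key) {
            if now.duration_since(*fetched_at) < self.ttl {
                return value.clone(); // stale data is served until the TTL expires
            }
        }
        let value = fetch();
        self.entries.insert(key.to_string(), (now, value.clone()));
        value
    }
}
```

With a 4-hour TTL, a lookup made one hour after the first fetch still returns the old value even if the chain already holds a newer one; only a lookup at or after the 4-hour mark triggers a new fetch. Because the map lives only in memory, a fresh `TtlCache` (as after a process restart) always fetches.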

The current solution

Your problem is most likely that different services pulled different versions, released at different times, and the local cache then kept ood-daemon from updating to the newly released version in time. There are two ways to resolve this:

1. Wait for a while

The update check is not time-sensitive and runs at low frequency. The internal cache expires after at most 4 hours, after which ood-daemon fetches the latest DecApp of each service from the chain.

2. Restart ood-daemon

The cache above is in memory only, so it is gone after ood-daemon restarts. The update mechanism then starts from scratch and pulls the latest version information from the chain.

@lurenpluto
Member

But this does expose a consistency problem. Consider our service release scenario:

For the set of services in a version list, the DecApp objects of the updated services are published serially each time a version is released. If the whole process takes a long time, there is an intermediate state in the middle of the release in which only some of the service DecApp objects are already in effect, and ood-daemon can see that intermediate state on the meta chain.

If possible, the DecApp objects of a group of services should be published to the meta chain at the same time, preferably in the same block, so that ood-daemon never sees the intermediate state. @weiqiushi
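
The difference between serial and single-step publishing can be shown with a toy model. Everything here is hypothetical: `ChainView` is just a map from service name to its newest published build number, and the two publish functions only mimic what a reader of the chain could observe, not how the meta chain works.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical view of the chain: service name -> newest published build.
type ChainView = BTreeMap<String, u32>;

/// True when every service in the view reports the same build.
fn is_consistent(view: &ChainView) -> bool {
    view.values().collect::<BTreeSet<_>>().len() <= 1
}

/// Publish services one by one and record, after each step, whether a
/// reader looking at the chain at that moment would see a consistent set.
fn publish_serially(view: &mut ChainView, services: &[&str], build: u32) -> Vec<bool> {
    services
        .iter()
        .map(|name| {
            view.insert(name.to_string(), build);
            is_consistent(view)
        })
        .collect()
}

/// Publish all services in one step, as a single transaction would:
/// readers observe either the old view or the new one, never a mix.
fn publish_atomically(view: &mut ChainView, services: &[&str], build: u32) -> bool {
    let mut next = view.clone();
    for name in services {
        next.insert(name.to_string(), build);
    }
    *view = next;
    is_consistent(view)
}
```

Starting from three services at build 750 and publishing 753 serially, a reader sees a mixed set after the first and second steps and a consistent one only after the third.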

@lurenpluto lurenpluto moved this to 💬To Discuss in CYFS-Stack & Services Apr 22, 2023
@weiqiushi
Member

This problem only occurs when service_version is configured as something other than "default".

In this configuration, each service is treated as having its own version line and is checked for updates separately. This is in line with the design principle, because in the future each service may be updated independently instead of uniformly as it is now.

When service_version is configured as "default", ood-daemon decides the local service versions by checking the service list on the chain. Once the official service list is published, the service versions on the OOD are guaranteed to be aligned and the above situation will not happen.

@lurenpluto
Member

This problem only occurs when service_version is configured as something other than "default".

In this configuration, each service is treated as having its own version line and is checked for updates separately. This is in line with the design principle, because in the future each service may be updated independently instead of uniformly as it is now.

When service_version is configured as "default", ood-daemon decides the local service versions by checking the service list on the chain. Once the official service list is published, the service versions on the OOD are guaranteed to be aligned and the above situation will not happen.

In fact, the key to this problem is whether the service list configured for ood-daemon specifies an explicit version or a wildcard version based on semver rules, such as *, 1.0, or another non-specific version. For an explicit version, when the service DecApp objects are inconsistent, the update cycle of ood-daemon may fail with an error, but it will not update to an inconsistent version. For a rule-based version, ood-daemon may end up with its local services in an "intermediate state", where some services are new and others are still old.

So we still need to keep the DecApp objects of our service list as consistent as possible when we publish them.
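
A small sketch of the two cases, under stated assumptions: `Rule` and `resolve` are hypothetical names, build numbers stand in for full version strings, and `Latest` stands in for any rule-based configuration such as `*`. It is not how ood-daemon is implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical rule taken from the locally configured service list.
enum Rule {
    Exact(u32), // an explicit build number, e.g. 753
    Latest,     // a wildcard such as "*"
}

/// Resolve every service against the builds currently published for it.
/// With `Exact`, one missing build fails the whole update cycle.
/// With `Latest`, each service resolves independently of the others.
fn resolve(
    available: &BTreeMap<&str, Vec<u32>>,
    rule: &Rule,
) -> Result<BTreeMap<String, u32>, String> {
    let mut plan = BTreeMap::new();
    for (name, builds) in available {
        let chosen = match rule {
            Rule::Exact(v) => builds.iter().copied().find(|b| b == v),
            Rule::Latest => builds.iter().copied().max(),
        };
        match chosen {
            Some(b) => {
                plan.insert(name.to_string(), b);
            }
            None => return Err(format!("no matching build for {name}")),
        }
    }
    Ok(plan)
}
```

If gateway already has 753 published but ood-daemon only has 744, `Exact(753)` returns an error and nothing is updated, whereas `Latest` produces a plan with gateway at 753 and ood-daemon at 744, which is the mixed state described in this issue.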

@lurenpluto
Member

So in the long run, we need to optimize the service release process. All service object updates on the meta chain must be atomic within one release: they need to land in one block, and multiple service objects need to be packaged into one transaction when they go on the chain. This may involve the following optimizations and changes:

  1. The CI process that compiles and packages the services
  2. The meta chain tool needs to support putting multiple objects in one transaction.

@lurenpluto
Member

Relevant optimizations are tracked in the following issue:
#242

@lurenpluto lurenpluto moved this from 💬To Discuss to ✅Done in CYFS-Stack & Services Apr 28, 2023
Projects
Status: Done
Development

No branches or pull requests

3 participants