Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve gnmi full config update #1840

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 36 additions & 12 deletions doc/mgmt/gnmi/SONiC_GNMI_Server_Interface_Design.md
Original file line number Diff line number Diff line change
Expand Up @@ -441,18 +441,42 @@ For incremental configuration, 'config apply-patch' should not restart gNMI, bgp

##### 1.2.1.11.2 Full Configurations

For full configuration, there’re 2 possible solutions:
* gNMI server needs to send response at first, and then invoke 'config reload'.

<img src="images/full_rpc.svg" alt="full-gnmi" width="800px"/>

* Use gNMI request and gNOI request together to implement full configuration update.

<img src="images/full_rpc_gnoi.svg" alt="full-gnmi-gnoi" width="800px"/>

The full configuration request will be overwritten by subsequent full configuration request or incremental configuration request.

<img src="images/mixed requests.svg" alt="overwritten-config" width="800px"/>
We will use single gnmi request to support full configuration update.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

during the full configuration update, do we block any config update request from other interfaces like REST, CLI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not block other configuration update requests. All configuration updates made before the 'config reload' will be overwritten, but those made after the 'config reload' will still function.

1. The host service will perform YANG validation and return an error if it fails.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if the yang validation address the validation based on the state data, like TCAM ACL limit? If not then yang validation will succeed, but config reload may fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Currently, SONiC yang does not check the ACL limit, but we can add this limitation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases where we need to enforce custom validation rules that go beyond the constraints of the Yang model, how does the current design plan to handle these? translib based infrastructure provides this mechanism.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can continue using the current translib infrastructure, and this change will not impact it. We are utilizing the origin field of the gnmi request, and only those requests with the 'sonic-db' origin field can use this gnmi full config update.

2. The host service will mask the gNMI service, preventing it from restarting after a config reload.
3. The host service will execute a config reload using the input configuration.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any chance that switch will not be in desired state after config reload, if so will it be handled here? like rollback?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we will use "config reload -f" to rollback

4. The host service will execute a config save if step 3 is successful.
5. The host service will perform a config reload to recover if step 3 fails.
6. The host service will unmask the gNMI service.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the gNMI service will not reload in this whole flow.. What if the new config includes changes to the management network or gNMI server settings (port, auth mode, certs etc)? AFAIK gNMI server requires a restart to apply these changes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is listed as a limitation below.. Is there any gNOI interface for clients to force restart the server?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use gnoi reboot interface to restart device and apply all the configurations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think reboot will be not so practical. There will be a significant downtime due to config reload already. Reboot requires another downtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we don't have a good solution at the moment. Let's keep this limitation, as gNMI can't be used to change the management network or gNMI server settings.


```mermaid
sequenceDiagram
participant client
box SONiC
participant gnmi server
participant host service
end
client->>gnmi server: update full configuration
gnmi server->>host service: run yang validation
host service->>host service: run yang validation
host service-->>gnmi server: result
gnmi server->>host service: run config reload with input config
host service->>host service: mask gnmi service
host service->>host service: run config reload with input config
host service->>host service: unmask gnmi service
host service-->>gnmi server: result
gnmi server->>host service: if config reload suceeded, run config save
host service->>host service: run config save
host service-->>gnmi server: result
gnmi server->>host service: if config reload failed, run config reload to recover
host service->>host service: mask gnmi service
host service->>host service: run config reload -f
host service->>host service: unmask gnmi service
host service-->>gnmi server: result
gnmi server-->>client: result
```

One limitation is that we can't restart the gNMI server. Therefore, if we update the configuration for the gNMI server, the changes will not take effect. And we can use gNOI reboot request to apply the whole configurations.

#### 1.2.1.12 Backward Compatibility

Expand Down