Smartswitch Dash PA Validation offload to NPU #1717
base: master
Changes from all commits
76f56f2
83e891a
8401fd1
85e130b
6291a73
6f6cda0
@@ -0,0 +1,172 @@
## SmartSwitch PA Validation NPU Offload

- [About this Manual](#about-this-manual)
- [Definitions/Abbreviations](#definitionsabbreviations)
- [1. Requirements Overview](#1-requirements-overview)
  - [1.1 Functional Requirements](#11-functional-requirements)
  - [1.2 Scale Requirements](#12-scale-requirements)
- [2 Modules Design](#2-modules-design)
  - [2.1 Dash Offload Manager](#21-dash-offload-manager)
    - [2.1.1 SONiC GNMI server changes](#211-sonic-gnmi-server-changes)
  - [2.2 PA Validation Offload](#22-pa-validation-offload)
    - [2.2.1 PA Validation Offload ACL configuration](#221-pa-validation-offload-acl-configuration)
  - [2.3 DPU Shut/Restart](#23-dpu-shutrestart)
- [3. Test Plan](#3-test-plan)

###### Revision

| Rev | Date | Author | Change Description |
|:---:|:-----------:|:---------------------:|-----------------------------------|
| 0.1 | 06/10/2024 | Kumaresh Perumal | Initial version |
| 0.2 | 08/21/2024 | Yakiv Huryk | Added DashOffloadManager |

# About this Manual
This document describes how the PA validation feature is offloaded to the NPU in SmartSwitch.

# Definitions/Abbreviations

| Term | Definition |
|--------------------------|------------------------------------------|
| ACL | Access Control List |
| NPU | Network Processing Unit |
| DPU | Data Processing Unit |

# 1. Requirements Overview

## 1.1 Functional Requirements
- Provide a generic infrastructure (DashOffloadManager) for offloading DASH functionality to the NPU.
- DashOffloadManager should listen to the DASH configuration and either intercept it for offloading or forward it to the DPU.
- Provide PA validation support on the NPU when the DPU pipeline model does not support this feature.
- Use the existing ACL table and ACL rule design on the NPU.

## 1.2 Scale Requirements
PA Validation configuration:
* 4096 PA Validation entries

Please refer to the [SONiC-DASH HLD](https://github.com/sonic-net/SONiC/blob/master/doc/dash/dash-sonic-hld.md#3213-underlay-src-ip-pa-validation) for more details regarding the PA Validation configuration requirements.

ACL:
- One ACL table per DPU
- 4160 ACL rules:
  - 4096 ACL forward rules (one per PA Validation address)
  - 64 ACL drop rules (one per unique VNI in the configuration)
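
The ACL rule total above is the sum of the forward and drop rules. A minimal sketch of that arithmetic follows; the constant and function names are illustrative and do not come from the design.

```python
# Scale figures taken from section 1.2; the names are illustrative only.
MAX_PA_VALIDATION_ADDRESSES = 4096  # one forward rule per PA Validation address
MAX_UNIQUE_VNIS = 64                # one drop rule per unique VNI


def acl_rule_budget(addresses: int, vnis: int) -> int:
    """Return the number of ACL rules needed in one DPU's ACL table."""
    return addresses + vnis


total_rules = acl_rule_budget(MAX_PA_VALIDATION_ADDRESSES, MAX_UNIQUE_VNIS)
print(total_rules)  # 4160
```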

# 2 Modules Design

## 2.1 Dash Offload Manager
The new orchagent application, DashOffloadManager, will be responsible for the DASH offloading logic. It will collect all the information needed for offloading and perform all the relevant configuration.
To get the DASH configuration that should be offloaded, the DashOffloadManager will act as a transparent ZMQ proxy between the GNMI server and the DPU swss, forwarding the configuration and intercepting the tables that should be offloaded.

Review comment: Why is the ZMQ proxy necessary? Could you subscribe to DPU_APPL_DB instead?

Review comment: The idea is to have control over what is sent to the DPU, which is a more flexible tool for implementing different kinds of offload-related functionality. In the case of PA validation offload, we don't forward the PA Validation table entries to the DPU (since the DPU has no use for them). This saves some NPU<->DPU bandwidth and keeps the DPU side simple (no entries, nothing to create). In the future, we can also extend this infrastructure to alter the content of the configuration (if such a need arises for some other offload feature).

Review comment: GNMI client --> GNMI server --> ZMQ --> DPU orchagent --> NPU DPU_APPL_STATE_DB. The DPU orchagent should update the results to the DPU_APPL_STATE_DB on the NPU. If we don't forward the PA Validation table entries to the DPU, the client won't know if this configuration failed or was intercepted.

Review comment: Are you referring to the flow described here #1759 ?

Review comment: Could you please include this in the design?

Review comment: added

Review comment: I am having the same concern here.... This seems to be complicated to implement, because of a few things:
Overall, I feel Gang is correct. The other way that Gang proposed there is actually much cleaner and more maintainable. All we need is just an if case in the swss, and everything else can be reused, such as the feedback loops.

Review comment: @ganglyu , please let us know your thoughts.

Review comment: I'm not sure if this design is necessary. If the DPU doesn't need the PA validation table and only the NPU requires it, we can configure this table directly for the NPU.

To simplify the management of the DPU offload and achieve optimal performance, every DPU is handled by a separate instance of a ZMQ proxy (a pair of ZMQ server and client).
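
The forward-or-intercept decision made by each per-DPU proxy can be sketched as below. This is only an illustration of the routing idea under stated assumptions: it uses plain Python lists instead of real ZMQ sockets, the class name `DpuProxySketch` and the `OFFLOADED_TABLES` set are hypothetical, and `DASH_VNET_TABLE` is used merely as an example of a table that is not offloaded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Tables handled on the NPU instead of the DPU. Only PA validation is named
# in this design; the set itself is an illustrative assumption.
OFFLOADED_TABLES = {"DASH_PA_VALIDATION_TABLE"}

Message = Tuple[str, str, Dict[str, str]]  # (table, key, fields)


@dataclass
class DpuProxySketch:
    """Stand-in for one per-DPU ZMQ proxy (server and client pair)."""
    dpu_index: int
    forwarded: List[Message] = field(default_factory=list)    # would be sent on to the DPU
    intercepted: List[Message] = field(default_factory=list)  # would be handed to an offload orch

    def handle(self, table: str, key: str, fields: Dict[str, str]) -> str:
        """Route one configuration message and report what was done with it."""
        message = (table, key, fields)
        if table in OFFLOADED_TABLES:
            self.intercepted.append(message)
            return "intercepted"
        self.forwarded.append(message)
        return "forwarded"


# One proxy instance per DPU.
proxies = {index: DpuProxySketch(index) for index in range(2)}
pa_result = proxies[0].handle("DASH_PA_VALIDATION_TABLE", "1000", {"addresses": "10.0.0.1"})
other_result = proxies[0].handle("DASH_VNET_TABLE", "Vnet1", {"vni": "1000"})
print(pa_result, other_result)
```

Keeping one instance per DPU means the state of one DPU's proxy never affects another's, which matches the isolation argument made in the text.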

<img src="images/DashOffloadManager.svg">

Review comment: I assume this is after the GNMI splitter; it would be better to make it clearer how it works with the independent DPU upgrade changes.

Once the offload is required, the DashOffloadManager will start a designated orch (e.g. PaValidationOffloadOrch) that will subscribe to the configuration and perform the offload logic.

Review comment: Where is the data stored on the NPU side? Is it in the NPU database or the DPU database? If it is in the NPU database, the schema will need to be changed, but I think that is not covered in the spec.

<img src="images/DashOffloadManagerWithConsumer.svg">

The Dash Offload Manager is disabled by default and only enabled for specific platforms that require its functionality.

Review comment: Do we have any logic to update the GNMI server to point to the proxy or the DPU?

### 2.1.1 SONiC GNMI server changes
A new parameter, `zmq_dpu_proxy_address_base`, is added to telemetry.go to enable the GNMI server -> DASH Offload Manager ZMQ connection. If specified, the GNMI server will use `zmq_dpu_proxy_address_base + dpu_index:zmq_port` as the destination for the DPU ZMQ connection. For example, on a system with 2 DPUs and base address 127.0.0.10, the DASH Offload Manager's ZMQ server runs on 127.0.0.10:8100 for DPU0 and on 127.0.0.11:8100 for DPU1.
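
The address derivation in the example can be sketched as follows. This is not the telemetry.go implementation; the function name is hypothetical, the default port 8100 is taken from the example above, and only IPv4 base addresses are assumed.

```python
import ipaddress


def dpu_proxy_endpoint(address_base: str, dpu_index: int, zmq_port: int = 8100) -> str:
    """Return "<base + dpu_index>:<port>" for an IPv4 base address."""
    # Adding an integer to an IPv4Address yields the address that many steps higher.
    address = ipaddress.IPv4Address(address_base) + dpu_index
    return f"{address}:{zmq_port}"


for index in range(2):
    print(f"DPU{index} -> {dpu_proxy_endpoint('127.0.0.10', index)}")
```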
liuh-80 marked this conversation as resolved.

## 2.2 PA Validation Offload

A new orchagent (PaValidationOffloadOrch) is responsible for offloading the PA_VALIDATION table entries. Once started by DashOffloadManager, it subscribes to the PA_VALIDATION table and maps the PA validation entries into the ACL configuration.

### 2.2.1 PA Validation Offload ACL configuration

The offloading is achieved by using a user-defined EGRESS ACL table bound to the NPU<->DPU port. The table is created and managed per DPU.

```
{
    "ACL_TABLE_TYPE": {
        "DASH_PA_VALIDATION": {
            "MATCHES": [
                "TUNNEL_VNI",
                "SRC_IP",
                "SRC_IPV6"
            ],
            "ACTIONS": [
                "PACKET_ACTION"
            ],
            "BIND_POINTS": [
                "PORT"
            ]
        }
    }
}
```

Review comment (on the `MATCHES` list): It would be better to add the destination IP here as well, because SmartSwitch lives in T1 and can receive other traffic in the same VNET that is sent not to the DPU but to other VMs. Adding the SmartSwitch data plane VIP as the destination IP will be safer and more future-proof.

Please refer to https://github.com/sonic-net/SONiC/blob/master/doc/acl/ACL-Table-Type-HLD.md for more details regarding User-Defined ACL tables.

The ACL table configuration:

```
{
    "ACL_TABLE": {
        "DASH_PA_VALIDATION_DPU0": {
            "STAGE": "EGRESS",
            "TYPE": "DASH_PA_VALIDATION",
            "PORTS": ["Ethernet0"]
        }
    }
}
```

Each PA Validation entry is translated into the following set of rules:

1) For every address in pa_validation.addresses:
```
{
    "ACL_RULE": {
        "DASH_PA_VALIDATION_DPU0|RULE_VNI_{pa_validation.vni}_{idx}": {
            "PRIORITY": "10",
            "PACKET_ACTION": "ACCEPT",
            "SRC_IP": "{address}/32",
            "TUNNEL_VNI": {pa_validation.vni}
        }
    }
}
```

2) A single drop rule with a lower priority:
```
{
    "ACL_RULE": {
        "DASH_PA_VALIDATION_DPU0|RULE_VNI_{pa_validation.vni}_DROP": {
            "PRIORITY": "1",
            "PACKET_ACTION": "DROP",
            "TUNNEL_VNI": {pa_validation.vni}
        }
    }
}
```

<img src="images/DashOffloadAcl.svg">
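
The translation from one PA Validation entry to ACL rules can be sketched as a small function. This is an illustration rather than the PaValidationOffloadOrch code: the function name is hypothetical, rule indices are assumed to start at 0, `TUNNEL_VNI` is assumed to be written as a string, and the `SRC_IPV6` branch with a /128 mask is an assumption based on the `SRC_IPV6` match field in the table type.

```python
import ipaddress
from typing import Dict, List


def pa_validation_to_acl_rules(table: str, vni: int, addresses: List[str]) -> Dict[str, dict]:
    """Build the ACL_RULE entries for one PA Validation entry (one VNI)."""
    rules: Dict[str, Dict[str, str]] = {}
    for idx, address in enumerate(addresses):
        ip = ipaddress.ip_address(address)
        # Host match: /32 for IPv4 (as in the design), /128 assumed for IPv6.
        if ip.version == 4:
            match = {"SRC_IP": f"{ip}/32"}
        else:
            match = {"SRC_IPV6": f"{ip}/128"}
        rules[f"{table}|RULE_VNI_{vni}_{idx}"] = {
            "PRIORITY": "10",
            "PACKET_ACTION": "ACCEPT",
            **match,
            "TUNNEL_VNI": str(vni),
        }
    # Lower-priority catch-all: drop everything else arriving with this VNI.
    rules[f"{table}|RULE_VNI_{vni}_DROP"] = {
        "PRIORITY": "1",
        "PACKET_ACTION": "DROP",
        "TUNNEL_VNI": str(vni),
    }
    return {"ACL_RULE": rules}


config = pa_validation_to_acl_rules("DASH_PA_VALIDATION_DPU0", 1000, ["10.0.0.1", "2001:db8::1"])
for key in config["ACL_RULE"]:
    print(key)
```

Because the accept rules carry the higher priority, a packet from a listed address matches before the per-VNI drop rule is considered.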

### 2.2.2 PA Validation Offload GNMI feedback
To preserve the GNMI feedback behavior, the offload logic must also create an entry in the DPU's APPL_STATE_DB.
For each PA validation entry processed, the PaValidationOffloadOrch creates the following entry:

```
DASH_PA_VALIDATION_TABLE:{{vni}}
    "result": {{result}}
```

Review comment (on the entry above): Does this mean the NPU-side PAValidationOffloadOrch will write into the DPU database?

The result is 0 on success, or a value greater than 0 that carries the error code.
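
A minimal sketch of how such a feedback entry could be composed is shown below. The function name is hypothetical, the value is assumed to be stored as a string, and no real database client is used.

```python
from typing import Dict, Tuple


def feedback_entry(vni: int, result: int) -> Tuple[str, Dict[str, str]]:
    """Return the (key, fields) pair for one processed PA validation entry."""
    if result < 0:
        raise ValueError("result must be 0 (success) or a positive error code")
    key = f"DASH_PA_VALIDATION_TABLE:{vni}"
    return key, {"result": str(result)}


key, fields = feedback_entry(1000, 0)
print(key, fields)
```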

Please refer to https://github.com/sonic-net/SONiC/pull/1759 for more details regarding GNMI feedback requirements and behavior.

## 2.3 DPU Shut/Restart
When a DPU goes down or restarts, the ACL configuration must be cleaned up. This is done by the Dash Offload Manager, which listens to the ChassisStateDB DPU_STATE table. When it detects that the DPU is down (dpu_control_plane_state is down), the PaValidationOffloadOrch is deinitialized, which cleans up the ACL configuration and removes the ZMQ proxy subscription.
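
The cleanup trigger can be sketched as a handler for DPU_STATE updates. This is an illustration under assumptions: the class and method names are hypothetical, the installed ACL rules are tracked in a plain dictionary instead of a real database, and the field value `"down"` follows the wording above.

```python
from typing import Dict, Set


class DashOffloadManagerSketch:
    """Tracks which ACL rules are installed for each DPU."""

    def __init__(self) -> None:
        # DPU name -> ACL rule keys currently installed for that DPU.
        self.acl_rules: Dict[str, Set[str]] = {}

    def start_offload(self, dpu: str, rule_keys: Set[str]) -> None:
        """Record the rules created when the offload orch is started."""
        self.acl_rules[dpu] = set(rule_keys)

    def on_dpu_state(self, dpu: str, state_fields: Dict[str, str]) -> bool:
        """Handle a DPU_STATE update; return True if a cleanup happened."""
        if state_fields.get("dpu_control_plane_state") != "down":
            return False
        # Deinitialize the offload for this DPU: its ACL configuration is removed.
        return self.acl_rules.pop(dpu, None) is not None


manager = DashOffloadManagerSketch()
manager.start_offload("DPU0", {"DASH_PA_VALIDATION_DPU0|RULE_VNI_1000_DROP"})
still_up = manager.on_dpu_state("DPU0", {"dpu_control_plane_state": "up"})
cleaned = manager.on_dpu_state("DPU0", {"dpu_control_plane_state": "down"})
print(still_up, cleaned, manager.acl_rules)
```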

Review comment: We might also need to cover the DPU upgrade case, because when the chassis DB is updated, this orch might not be running at all. @ganglyu to confirm.

Review comment: Theoretically, we need to upgrade this orch when we upgrade the DPU.

# 3. Test Plan

| Test ID | Testcase | Expectation |
|:-----------------:|:----------------------------------|:--------------------|
| 1 | PA_VALIDATION entries pushed to DPU | The egress ACL table is attached to the DPU port and contains the rules. |
| 2 | Add more PA_VALIDATION entries | The ACL rules are added to the egress ACL table. |
| 3 | Delete PA_VALIDATION entries | The ACL rules are deleted from the NPU. |
| 4 | DPU shutdown | The ACL configuration is cleaned up. |

Review comment: missing table headers