Smartswitch Dash PA Validation offload to NPU #1717

Open: wants to merge 6 commits into base: master
172 changes: 172 additions & 0 deletions doc/smart-switch/PA-Validation/SmartSwitchPAValidationOffload.md
@@ -0,0 +1,172 @@
## SmartSwitch PA Validation NPU Offload

- [About this Manual](#about-this-manual)
- [Definitions/Abbreviations](#definitionsabbreviations)
- [1. Requirements Overview](#1-requirements-overview)
- [1.1 Functional Requirements](#11-functional-requirements)
- [1.2 Scale Requirements](#12-scale-requirements)
- [2 Modules Design](#2-modules-design)
- [2.1 Dash Offload Manager](#21-dash-offload-manager)
- [2.1.1 SONiC GNMI server changes](#211-sonic-gnmi-server-changes)
- [2.2 PA Validation Offload](#22-pa-validation-offload)
- [2.2.1 PA Validation Offload ACL configuration](#221-pa-validation-offload-acl-configuration)
- [2.3 DPU Shut/Restart](#23-dpu-shutrestart)
- [3. Test Plan](#3-test-plan)

###### Revision

| Rev | Date | Author | Change Description |
|:---:|:-----------:|:---------------------:|-----------------------------------|
| 0.1 | 06/10/2024 | Kumaresh Perumal | Initial version |
| 0.2 | 08/21/2024 | Yakiv Huryk | Added DashOffloadManager |


# About this Manual
This document describes how the PA validation feature is offloaded to the NPU in a SmartSwitch.

# Definitions/Abbreviations

| | |
|--------------------------|------------------------------------------|
| ACL | Access Control List |
| NPU | Network Processing Unit |
| DPU | Data Processing Unit |

Contributor comment:
missing table headers


# 1. Requirements Overview

## 1.1 Functional Requirements
- Provide a generic infrastructure for offloading DASH functionality to the NPU (DashOffloadManager).
- DashOffloadManager should listen to the DASH configuration and either intercept it for offloading, or forward it to the DPU.
- Provide PA validation support in the NPU when the DPU pipeline model does not support this feature.
- Reuse the existing ACL table and ACL rule design in the NPU.

## 1.2 Scale Requirements
PA Validation configuration:
* 4096 PA Validation entries

Please refer to [SONiC-DASH HLD](https://github.com/sonic-net/SONiC/blob/master/doc/dash/dash-sonic-hld.md#3213-underlay-src-ip-pa-validation) for more details regarding the PA Validation configuration requirements.

ACL:
- One ACL table per DPU
- 4160 ACL rules:
  - 4096 ACL forward rules (one per PA Validation address)
  - 64 ACL drop rules (one per unique VNI in the configuration)
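
The rule budget above follows from one forward rule per address plus one drop rule per VNI. A minimal Python sketch of that arithmetic, assuming a hypothetical in-memory mapping from VNI to its allowed addresses (the even 64 x 64 split and the sample addresses are only an illustration, not part of the requirements):

```python
def acl_rule_count(pa_validation: dict[int, list[str]]) -> int:
    """Count ACL rules: one forward rule per address, one drop rule per VNI."""
    forward_rules = sum(len(addresses) for addresses in pa_validation.values())
    drop_rules = len(pa_validation)
    return forward_rules + drop_rules


# Hypothetical worst case: 64 VNIs with 64 addresses each (4096 addresses).
config = {vni: [f"10.{vni}.0.{host}" for host in range(64)] for vni in range(64)}
total = acl_rule_count(config)
print(total)
```

Any other distribution of the 4096 addresses over 64 VNIs gives the same total, since only the number of addresses and the number of distinct VNIs matter.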

# 2 Modules Design

## 2.1 Dash Offload Manager
The new orchagent application DashOffloadManager will be responsible for DASH offloading logic. It will collect all the needed information for offloading and perform all the relevant configurations.
To get the DASH configuration that should be offloaded, the DashOffloadManager will act as a transparent ZMQ proxy between the GNMI server and the DPU swss, forwarding all the configuration and intercepting the tables that should be offloaded.
Contributor comment:
Why is the ZMQ proxy necessary? Could you subscribe to DPU_APPL_DB instead?

Contributor comment:
The idea is to have control over what is sent to the DPU, which is a more flexible tool for implementing different kinds of offload-related functionality.

In the case of PA validation offload, we don't forward the PA Validation table entries to the DPU (since the DPU has no use for them). This saves some NPU<->DPU bandwidth and keeps the DPU side simple (no entries, nothing to create).

In the future, we can also extend this infrastructure to alter the content of the configuration (if such a need arises for some other offload feature).

Contributor comment (@ganglyu, Nov 7, 2024):
GNMI client-->GNMI server-->ZMQ-->DPU orchagent-->NPU DPU_APPL_STATE_DB

The DPU orchagent should update the results to the DPU_APPL_STATE_DB on the NPU. If we don't forward the PA Validation table entries to the DPU, the client won't know if this configuration failed or was intercepted.

Contributor comment:
Are you referring to the flow described in #1759?
As I understand it, the feedback design is not finalized yet. Once it is, I'll have to add the same logic to the offload flow so that it also writes the result into the DPU_APPL_STATE_DB for the offloaded (intercepted) PA validation entries. Is that OK?

Contributor comment:
Could you please include this in the design?

Contributor comment:
added

Contributor comment:
I have the same concern here. This seems complicated to implement, for a few reasons:

  1. The DASH offload manager or the NPU-side DASH orchs need to provide exactly the same feedback loop as swss. If anything changes in swss, they need to change as well, which can easily be missed and cause problems.
  2. To support independent DPU upgrades, each DPU will need its own DASH offload manager and all the DASH orchs on the NPU side.
  3. Dependency and object handling can also be a problem. This solution tries to provide a generic way to handle all DASH object offloading in the future, but I feel it doesn't really do the job. An explicit PA validation rule might be the simplest case, where the only thing we need to do is redirect the rules to the NPU side. However, other DASH objects can have dependencies; e.g., implicit PA validation rules come from VNET + CA-PA mappings. In this case, we cannot simply redirect the rules to the NPU side but have to copy them, because they are also used in the outbound pipeline.

Overall, I feel Gang is correct. The other approach that Gang proposed is actually much cleaner and more maintainable. All we need is an if case in swss, and everything else can be reused, such as the feedback loops.

Contributor comment:
@ganglyu, please let us know your thoughts.

Contributor comment:
I'm not sure if this design is necessary. If the DPU doesn't need the PA validation table and only the NPU requires it, we can configure this table directly for the NPU.

To simplify the management of the DPU offload and achieve optimal performance, every DPU is handled by a separate instance of a ZMQ Proxy (a pair of ZMQ Server and Client).
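
The intercept-or-forward decision made by each per-DPU proxy can be modeled as below. This is a simplified Python sketch, not the orchagent implementation: the `DpuProxy` class, its fields, and the use of `DASH_VNET_TABLE` as an example of a non-offloaded table are all assumptions made for illustration, and no real ZMQ sockets are involved.

```python
from dataclasses import dataclass, field


@dataclass
class DpuProxy:
    """Hypothetical model of one per-DPU proxy instance."""
    dpu_index: int
    offloaded_tables: set[str] = field(default_factory=set)
    intercepted: list[tuple[str, str]] = field(default_factory=list)
    forwarded: list[tuple[str, str]] = field(default_factory=list)

    def handle(self, table: str, key: str) -> str:
        """Intercept entries of offloaded tables; forward everything else."""
        if table in self.offloaded_tables:
            self.intercepted.append((table, key))
            return "intercept"
        self.forwarded.append((table, key))
        return "forward"


# One proxy instance per DPU, each offloading the PA validation table.
proxies = {i: DpuProxy(i, {"DASH_PA_VALIDATION_TABLE"}) for i in range(2)}
pa_action = proxies[0].handle("DASH_PA_VALIDATION_TABLE", "1000")
vnet_action = proxies[0].handle("DASH_VNET_TABLE", "Vnet1")
```

Keeping the state per DPU means that tearing down one proxy does not affect the configuration flow of the others.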

<img src="images/DashOffloadManager.svg">
Contributor comment:
I assume this is after the GNMI splitter; it would be better to make it clearer how it works with the independent DPU upgrade changes.


Once the offload is required, the DashOffloadManager will start the designated orch (e.g. PaValidationOffloadOrch), which will subscribe to the configuration and perform the offload logic.
Contributor comment:
Where is the data stored on the NPU side? Is it in the NPU database or the DPU database? If it is in the NPU database, the schema will need to change, but I don't think that is covered in the spec.


<img src="images/DashOffloadManagerWithConsumer.svg">

The Dash Offload Manager is disabled by default and only enabled for specific platforms that require its functionality.
Contributor comment:
Do we have any logic to update the GNMI server to point to the proxy or the DPU?


### 2.1.1 SONiC GNMI server changes
A new parameter `zmq_dpu_proxy_address_base` is added to the telemetry.go to enable the GNMI Server -> DASH Offload Manager ZMQ connection. If specified, the GNMI server will use `zmq_dpu_proxy_address_base + dpu_index:zmq_port` as a destination for DPU ZMQ connection. For example, on the system with 2 DPUs and base address 127.0.0.10, the DASH Offload Manager's ZMQ server runs on 127.0.0.10:8100 for DPU0 and on 127.0.0.11:8100 for DPU1.
liuh-80 marked this conversation as resolved.
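
The address derivation described for `zmq_dpu_proxy_address_base` can be sketched with Python's standard `ipaddress` module. The function name `dpu_proxy_endpoint` is hypothetical, and the default port 8100 is taken from the example in the text rather than from the GNMI server code:

```python
import ipaddress


def dpu_proxy_endpoint(base: str, dpu_index: int, zmq_port: int = 8100) -> str:
    """Return '<base + dpu_index>:<zmq_port>' for the given DPU."""
    address = ipaddress.ip_address(base) + dpu_index
    return f"{address}:{zmq_port}"


endpoints = [dpu_proxy_endpoint("127.0.0.10", index) for index in range(2)]
print(endpoints)
```

For the two-DPU example in the text, this yields `127.0.0.10:8100` for DPU0 and `127.0.0.11:8100` for DPU1.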

## 2.2 PA Validation Offload

A new orchagent (PaValidationOffloadOrch) is responsible for offloading the PA_VALIDATION table entries. Once started by DashOffloadManager, it subscribes to the PA_VALIDATION table and maps the PA validation entries into the ACL configuration.

### 2.2.1 PA Validation Offload ACL configuration

The offloading is achieved by using a User-Defined EGRESS ACL table bound to the NPU<->DPU port. The table is created/managed per DPU.

```
{
    "ACL_TABLE_TYPE": {
        "DASH_PA_VALIDATION": {
            "MATCHES": [
                "TUNNEL_VNI",
                "SRC_IP",
                "SRC_IPV6"
            ],
            "ACTIONS": [
                "PACKET_ACTION"
            ],
            "BIND_POINTS": [
                "PORT"
            ]
        }
    }
}
```

Contributor comment (on the `MATCHES` list):
It would be better to add the destination IP here as well. The reason is that SmartSwitch lives in T1 and can receive other traffic in the same VNET, which is not sent to the DPU but to other VMs. Adding the SmartSwitch data plane VIP as the destination IP will be safer and more future-proof.

Please refer to https://github.com/sonic-net/SONiC/blob/master/doc/acl/ACL-Table-Type-HLD.md for more details regarding User-Defined ACL tables.

The ACL table configuration:

```
{
    "ACL_TABLE": {
        "DASH_PA_VALIDATION_DPU0": {
            "STAGE": "EGRESS",
            "TYPE": "DASH_PA_VALIDATION",
            "PORTS": ["Ethernet0"]
        }
    }
}
```

Each PA Validation entry is translated into the following set of rules:

1) For every address in pa_validation.addresses:
```
{
    "ACL_RULE": {
        "DASH_PA_VALIDATION_DPU0|RULE_VNI_{pa_validation.vni}_{idx}": {
            "PRIORITY": "10",
            "PACKET_ACTION": "ACCEPT",
            "SRC_IP": "{address}/32",
            "TUNNEL_VNI": {pa_validation.vni}
        }
    }
}
```

2) A single drop rule with a lower priority:
```
{
    "ACL_RULE": {
        "DASH_PA_VALIDATION_DPU0|RULE_VNI_{pa_validation.vni}_DROP": {
            "PRIORITY": "1",
            "PACKET_ACTION": "DROP",
            "TUNNEL_VNI": {pa_validation.vni}
        }
    }
}
```
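
The translation of one PA validation entry into the two kinds of rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the PaValidationOffloadOrch code: the function name is hypothetical, `idx` is assumed to start at 0, `TUNNEL_VNI` is emitted as a string, and IPv6 addresses are assumed to use the `SRC_IPV6` match with a /128 mask (the rule templates above only show the IPv4 case).

```python
import ipaddress


def pa_validation_to_acl_rules(acl_table: str, vni: int, addresses: list[str]) -> dict:
    """Build one ACCEPT rule per address plus one lower-priority DROP rule."""
    rules = {}
    for idx, address in enumerate(addresses):
        ip = ipaddress.ip_address(address)
        # Assumption: IPv6 addresses map to SRC_IPV6 with a full-length mask.
        if ip.version == 4:
            match_field, prefix_len = "SRC_IP", 32
        else:
            match_field, prefix_len = "SRC_IPV6", 128
        rules[f"{acl_table}|RULE_VNI_{vni}_{idx}"] = {
            "PRIORITY": "10",
            "PACKET_ACTION": "ACCEPT",
            match_field: f"{ip}/{prefix_len}",
            "TUNNEL_VNI": str(vni),
        }
    # Catch-all for the VNI: anything not accepted above is dropped.
    rules[f"{acl_table}|RULE_VNI_{vni}_DROP"] = {
        "PRIORITY": "1",
        "PACKET_ACTION": "DROP",
        "TUNNEL_VNI": str(vni),
    }
    return {"ACL_RULE": rules}


config = pa_validation_to_acl_rules(
    "DASH_PA_VALIDATION_DPU0", 1000, ["10.0.0.1", "2001:db8::1"]
)
```

Because the drop rule matches only on the VNI and has the lower priority, traffic in that VNI is dropped unless its source address matches one of the accept rules.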

<img src="images/DashOffloadAcl.svg">

### 2.2.2 PA Validation Offload GNMI feedback
To preserve the GNMI feedback behavior, the offload logic must also create an entry in the DPU's APPL_STATE_DB.
For each PA validation entry it processes, the PaValidationOffloadOrch creates the following entry:

```
DASH_PA_VALIDATION_TABLE:{{vni}}
    "result": {{result}}
```

Contributor comment:
Does this mean the NPU-side PAValidationOffloadOrch will write into the DPU database?

The result is 0 on success, or a value greater than 0 that represents an error code.
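
A small Python sketch of how such a feedback entry could be assembled, assuming the key layout shown above. The helper name `feedback_entry` is hypothetical, the value is stored as a string on the assumption that database fields are strings, and the actual write to APPL_STATE_DB is intentionally left out:

```python
def feedback_entry(vni: int, result: int) -> tuple[str, dict[str, str]]:
    """Return the (key, fields) pair for one processed PA validation entry."""
    if result < 0:
        raise ValueError("result must be 0 (success) or a positive error code")
    key = f"DASH_PA_VALIDATION_TABLE:{vni}"
    return key, {"result": str(result)}


success_key, success_fields = feedback_entry(1000, 0)
failure_key, failure_fields = feedback_entry(2000, 5)
```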

Please refer to https://github.com/sonic-net/SONiC/pull/1759 for more details regarding GNMI feedback requirements and behavior.

## 2.3 DPU Shut/Restart
When a DPU goes down or restarts, the ACL configuration must be cleaned up. This is done by the Dash Offload Manager, which listens to the ChassisStateDB DPU_STATE table. When it detects that the DPU is down (dpu_control_plane_state is down), the PaValidationOffloadOrch is deinitialized, which cleans up the ACL configuration and removes the ZMQ proxy subscription.
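
The cleanup flow can be modeled roughly as below. This Python sketch is only an illustration: the `OffloadLifecycle` class and its attributes are hypothetical, the `"up"` state value is an assumption (the text only mentions `down`), and the real logic would react to ChassisStateDB notifications rather than direct method calls.

```python
class OffloadLifecycle:
    """Hypothetical model of the per-DPU offload state kept by the manager."""

    def __init__(self) -> None:
        self.orch_running = False
        self.acl_rules: dict[str, dict[str, str]] = {}
        self.proxy_subscribed = False

    def on_dpu_state(self, fields: dict[str, str]) -> None:
        """React to a DPU_STATE update for this DPU."""
        state = fields.get("dpu_control_plane_state")
        if state == "down" and self.orch_running:
            # Deinitialize: drop the ACL configuration and the subscription.
            self.acl_rules.clear()
            self.proxy_subscribed = False
            self.orch_running = False
        elif state == "up" and not self.orch_running:
            # Assumption: the orch is (re)started once the DPU is back up.
            self.proxy_subscribed = True
            self.orch_running = True


lifecycle = OffloadLifecycle()
lifecycle.on_dpu_state({"dpu_control_plane_state": "up"})
lifecycle.acl_rules["DASH_PA_VALIDATION_DPU0|RULE_VNI_1000_DROP"] = {"PACKET_ACTION": "DROP"}
lifecycle.on_dpu_state({"dpu_control_plane_state": "down"})
```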
Contributor comment:
We might also need to cover the DPU upgrade case, because when the chassis DB is updated, this orch might not be running at all.

@ganglyu to confirm.

Contributor comment:
Theoretically, we need to upgrade this orch when we upgrade the DPU.


# 3. Test Plan

| Test ID | Testcase | Expectation |
|:-----------------:|:----------------------------------|:--------------------|
|1 | PA_VALIDATION entries pushed to DPU | The egress ACL_TABLE is attached to the DPU port and contains the rules. |
|2 | Add more PA_VALIDATION entries | The ACL rules are added to the egress ACL table. |
|3 | Delete PA_VALIDATION entries | The ACL rules are deleted from the NPU. |
|4 | DPU shutdown | The ACL configuration is cleaned up. |