Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add manual local Thread commission test #48

Merged
merged 5 commits into from
Apr 22, 2024

Conversation

MonicaisHer
Copy link
Contributor

@MonicaisHer MonicaisHer commented Feb 27, 2024

Setup

  • Local Machine:

    • chip-tool snap
    • OTBR snap with Thread network
  • Remote Machine:

    • OTBR snap with otbr-agent
    • chip-all-clusters-minimal-app

Testing

$ cd chip-tool-snap/manual_thread_tests
$ LOCAL_INFRA_IF="eno1" \
   REMOTE_INFRA_IF="eth0" REMOTE_USER="ubuntu" REMOTE_PASSWORD="abcdef" \
   REMOTE_IP="192.168.178.95" \
   go test -v -failfast -count 1
2024/03/19 15:41:10 [CLEAN]
2024/03/19 15:41:10 [exec] sudo snap remove --purge chip-tool
2024/03/19 15:41:10 [stderr] snap "chip-tool" is not installed
2024/03/19 15:41:10 [SETUP]
2024/03/19 15:41:10 [exec] sudo snap install chip-tool --channel=latest/edge
2024/03/19 15:41:13 [stdout] chip-tool (edge) master+snap from Canonical IoT Labs installed
2024/03/19 15:41:13 [exec] sudo snap connect chip-tool:avahi-observe 
2024/03/19 15:41:13 [exec] sudo snap connect chip-tool:bluez 
2024/03/19 15:41:14 [exec] sudo snap connect chip-tool:process-control 
=== RUN   TestAllClustersAppThread
    snap.go:62: [exec] sudo snap remove --purge openthread-border-router
    exec.go:140: [stderr] snap "openthread-border-router" is not installed
    snap.go:28: [exec] sudo snap install openthread-border-router --channel=latest/edge
    exec.go:140: [stdout] openthread-border-router (edge) thread-reference-20230119+snap from Canonical IoT Labs installed
2024/03/19 15:41:18 [exec] sudo snap connect openthread-border-router:avahi-control 
2024/03/19 15:41:18 [exec] sudo snap connect openthread-border-router:firewall-control 
2024/03/19 15:41:19 [exec] sudo snap connect openthread-border-router:raw-usb 
2024/03/19 15:41:19 [exec] sudo snap connect openthread-border-router:network-control 
2024/03/19 15:41:20 [exec] sudo snap connect openthread-border-router:bluetooth-control 
2024/03/19 15:41:21 [exec] sudo snap connect openthread-border-router:bluez 
2024/03/19 15:41:21 [exec] sudo snap set openthread-border-router infra-if='eno1'
    snap.go:161: [exec] sudo snap start --enable openthread-border-router
    exec.go:140: [stdout] Started.
    config.go:59: Retry 1/10: Waiting for expected content in logs: Start Thread Border Agent: OK
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:63: Found expected content in logs: Start Thread Border Agent: OK
    thread_commission_test.go:28: [exec] sudo openthread-border-router.ot-ctl dataset init new
    thread_commission_test.go:28: [exec] sudo openthread-border-router.ot-ctl dataset commit active
    thread_commission_test.go:28: [exec] sudo openthread-border-router.ot-ctl ifconfig up
    thread_commission_test.go:28: [exec] sudo openthread-border-router.ot-ctl thread start
    config.go:59: Retry 1/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 2/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 3/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 4/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 5/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 6/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:59: Retry 7/10: Waiting for expected content in logs: Thread Network
    snap.go:138: [exec] sudo journalctl --since "2024-03-19 15:41:21" --no-pager | grep "openthread-border-router"|| true
    config.go:63: Found expected content in logs: Thread Network
    thread_commission_test.go:29: [exec] sudo openthread-border-router.ot-ctl dataset active -x | awk '{print $NF}' | grep --invert-match "Done"
    thread_commission_test.go:32: SSH connection to 192.168.178.96 established successfully
    thread_commission_test.go:32: Executed the command remotely: sudo snap remove --purge openthread-border-router
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap install openthread-border-router --edge
    thread_commission_test.go:32: Output: openthread-border-router (edge) thread-reference-20230119+snap from Canonical IoT Labs installed
    thread_commission_test.go:32: Executed the command remotely: sudo snap set openthread-border-router infra-if='eth0'
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:avahi-control
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:firewall-control
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:raw-usb
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:network-control
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:bluetooth-control
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect openthread-border-router:bluez
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap start openthread-border-router
    thread_commission_test.go:32: Output: Started.
    thread_commission_test.go:32: Executed the command remotely: sudo apt install bluez
    thread_commission_test.go:32: Output: Reading package lists...
        Building dependency tree...
        Reading state information...
        bluez is already the newest version (5.53-0ubuntu3.7).
        0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
    thread_commission_test.go:32: Executed the command remotely: sudo snap remove --purge matter-all-clusters-app
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap install matter-all-clusters-app --edge
    thread_commission_test.go:32: Output: matter-all-clusters-app (edge) fb786aa3+snap from Canonical IoT Labs installed
    thread_commission_test.go:32: Executed the command remotely: sudo snap set matter-all-clusters-app args='--thread'
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect matter-all-clusters-app:avahi-control
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect matter-all-clusters-app:bluez
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap connect matter-all-clusters-app:otbr-dbus-wpan0 openthread-border-router:dbus-wpan0
    thread_commission_test.go:32: Output: 
    thread_commission_test.go:32: Executed the command remotely: sudo snap start matter-all-clusters-app
    thread_commission_test.go:32: Output: Started.
    thread_commission_test.go:32: Retry 1/10: Waiting for expected content in logs: CHIP minimal mDNS started advertising
    thread_commission_test.go:32: sudo journalctl --since "2024-03-19 14:42:57" --no-pager | grep "matter-all-clusters-app"|| true
    thread_commission_test.go:32: Found expected content in logs: CHIP minimal mDNS started advertising
    thread_commission_test.go:32: Running matter all clusters app
=== RUN   TestAllClustersAppThread/Commission
    thread_commission_test.go:35: [exec] sudo chip-tool pairing ble-thread 110 hex:0e080000000000010000000300001a35060004001fffe0020857bfeea5c777e82b0708fdf4062767c156b50510218cb8b1dd970f50c47973411a4379b2030f4f70656e5468726561642d39306636010290f60410fdd7252a272574598c8b0538235bf2040c0402a0f7f8 20202021 3840 2>&1
    thread_commission_test.go:35: ...
        [1710859442.285269][41236:41272] CHIP:BLE: New device scanned: 00:1A:7D:DA:71:15
        [1710859442.285279][41236:41272] CHIP:BLE: Device discriminator match. Attempting to connect.
        [1710859442.287302][41236:41272] CHIP:BLE: ChipDeviceScanner has stopped scanning!
        [1710859462.286945][41236:41273] CHIP:DL: HandlePlatformSpecificBLEEvent 16386
        [1710859462.286957][41236:41273] CHIP:DIS: Closing all BLE connections
        [1710859462.286983][41236:41273] CHIP:TOO: Pairing Failure: src/platform/Linux/BLEManagerImpl.cpp:83: CHIP Error 0x00000032: Timeout
        [1710859462.287046][41236:41272] CHIP:DL: FAIL: ConnectDevice: Operation was cancelled (19)
        [1710859462.287061][41236:41272] CHIP:BLE: Device connection failed: src/platform/Linux/bluez/BluezEndpoint.cpp:742: CHIP Error 0x000000AC: Internal error
        [1710859462.287101][41236:41273] CHIP:DL: HandlePlatformSpecificBLEEvent 16386
        [1710859462.287183][41236:41236] CHIP:CTL: Shutting down the commissioner
        [1710859462.287252][41236:41236] CHIP:CTL: Shutting down the controller
        [1710859462.287275][41236:41236] CHIP:IN: Expiring all sessions for fabric 0x1!!
        [1710859462.287278][41236:41236] CHIP:FP: Forgetting fabric 0x1
        [1710859462.287288][41236:41236] CHIP:TS: Pending Last Known Good Time: 2023-10-14T01:16:48
        [1710859462.287350][41236:41236] CHIP:TS: Previous Last Known Good Time: 2023-10-14T01:16:48
        [1710859462.287353][41236:41236] CHIP:TS: Reverted Last Known Good Time to previous value
        [1710859462.287366][41236:41236] CHIP:CTL: Shutting down the commissioner
        [1710859462.287380][41236:41236] CHIP:CTL: Shutting down the controller
        [1710859462.287382][41236:41236] CHIP:CTL: Shutting down the System State, this will teardown the CHIP Stack
        [1710859462.287423][41236:41236] CHIP:DMG: All ReadHandler-s are clean, clear GlobalDirtySet
        [1710859462.287442][41236:41236] CHIP:FP: Shutting down FabricTable
        [1710859462.287445][41236:41236] CHIP:TS: Pending Last Known Good Time: 2023-10-14T01:16:48
        [1710859462.287472][41236:41236] CHIP:TS: Previous Last Known Good Time: 2023-10-14T01:16:48
        [1710859462.287474][41236:41236] CHIP:TS: Reverted Last Known Good Time to previous value
        [1710859462.287565][41236:41236] CHIP:DL: writing settings to file (/mnt/chip_counters.ini-y8j6Vc)
        [1710859462.287708][41236:41236] CHIP:DL: renamed tmp file to file (/mnt/chip_counters.ini)
        [1710859462.287718][41236:41236] CHIP:DL: NVS set: chip-counters/total-operational-hours = 0 (0x0)
        [1710859462.287721][41236:41236] CHIP:DL: Inet Layer shutdown
        [1710859462.287723][41236:41236] CHIP:DL: BLE shutdown
        [1710859462.288986][41236:41236] CHIP:DL: System Layer shutdown
        [1710859462.289219][41236:41236] CHIP:TOO: Run command failure: src/platform/Linux/BLEManagerImpl.cpp:83: CHIP Error 0x00000032: Timeout
    thread_commission_test.go:35: exit status 1
=== NAME  TestAllClustersAppThread
    snap.go:62: [exec] sudo snap remove --purge openthread-border-router
    exec.go:140: [stdout] openthread-border-router removed
    thread_commission_test.go:20: Executed the command remotely: sudo snap remove --purge matter-all-clusters-app
    thread_commission_test.go:20: Output: matter-all-clusters-app removed
    thread_commission_test.go:20: Executed the command remotely: sudo snap remove --purge openthread-border-router
    thread_commission_test.go:20: Output: openthread-border-router removed
--- FAIL: TestAllClustersAppThread (303.85s)
    --- FAIL: TestAllClustersAppThread/Commission (23.02s)
FAIL
2024/03/19 15:46:18 [TEARDOWN]
2024/03/19 15:46:18 Removing installed snap: true
2024/03/19 15:46:18 [exec] sudo snap remove --purge chip-tool
2024/03/19 15:46:21 [stdout] chip-tool removed
exit status 1
FAIL    chip-tool-snap-tests    311.062s

@MonicaisHer MonicaisHer changed the title Add Thread commission test Add manual local Thread commission test Feb 27, 2024
@MonicaisHer
Copy link
Contributor Author

MonicaisHer commented Feb 28, 2024

When manually setting up these steps, the device chip-all-cluster-minial-app could be successfully commissioned. However, when running the same steps in the above automated go test, the device can't be seen during the commissioning process. At the same time, on the device side, it keeps showing error:

CHIP:DL: FAIL: device is not connected

Here is the full log.

Update: When the chip-tool is installed on the same machine as the OTBR Thread Network, the commissioning process will succeed. Currently, the setup is such that the chip-tool is not installed on the same machine as the OTBR Thread network.

@MonicaisHer
Copy link
Contributor Author

MonicaisHer commented Feb 29, 2024

Error from native built chip-all-cluster-minimal-app during the commissioning process:

$ ./chip-all-clusters-minimal-app-commit-1536ca2 --thread
CHIP:DL: openthread: _SetThreadEnabled calling Reset failed: 
GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: 
An AppArmor policy prevents this sender from sending this message to this recipient; 
type="method_call", sender=":1.133" 
(uid=1000 pid=12380 comm="./chip-all-clusters-minimal-app-commit-1536ca2 --t" 
label="snap.code.code (complain)") 
interface="io.openthread.BorderRouter" member="Reset" error name="(unset)" 
requested_reply="0" destination=":1.112" 
(uid=0 pid=8379 comm="/snap/openthread-border-router/49/bin/otbr-agent -" 
label="snap.openthread-border-router.otbr-agent (enforce)")
$ journalctl -f | grep apparmor
msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/io/openthread/BorderRouter/wpan0" 
interface="org.freedesktop.DBus.Properties" member="GetAll" name=":1.121" mask="receive" pid=9648 
label="snap.openthread-border-router.otbr-agent" peer_pid=9834 peer_label="snap.code.code"

Update: The commission process was successful, and the above DBus error disappeared after rebooting both machines and performing a clean reinstallation of all components. The above error might have been due to an unknown setup environment issue.

Copy link
Member

@farshidtz farshidtz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the core pieces are in place, but there are some logical and programming issues. Please see the comments.


// Connect to the remote device
client := connectToRemoteDevice(t)
defer client.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client will not close on error because deferred statements don't get called on fatal errors that exit the application and don't return to the caller. Need to use t.Cleanup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. All defer statements have to be moved to t.Cleanup now.

Comment on lines 64 to 68
session, err := client.NewSession()
if err != nil {
return "", fmt.Errorf("failed to create session: %v", err)
}
defer session.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opening and closing an SSH session for every command isn't ideal. It should be possible to manage all the remote operations for this test within a single SSH session.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code has been updated to now open an SSH session only once.

Comment on lines 31 to 35
// Set remote configuration from environment variables
setRemoteConfigFromEnv()

// Connect to the remote device
client := connectToRemoteDevice(t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first function reads environment variables and stores those values in global variables. The second function reads those values from global variables. Another global variable is read by installOTBR function.

This coding pattern of loading into global variables and then reading them once elsewhere isn't recommended. True, it makes it easy to write code, but it makes the code equally less maintainable, error-prone, and harder to follow. What if another function modifies the values, or reads them before values are set?

}

// installOTBR installs OTBR and configures Thread network
func installOTBR(client *ssh.Client, t *testing.T) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name doesn't reflect the functionality. The installation is done remotely.

Copy link
Contributor Author

@MonicaisHer MonicaisHer Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the OTBR installation has been moved to:

func RemoteDeviceSetup(t *testing.T) {
    deployOTBRAgentOnRemoteDevice(t)
}

commands := []string{
"sudo snap remove --purge openthread-border-router",
"sudo snap install openthread-border-router --edge",
"sudo snap set openthread-border-router infra-if=" + infraInterfaceValue,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"sudo snap set openthread-border-router infra-if=" + infraInterfaceValue,
"sudo snap set openthread-border-router infra-if=" + remoteInfraInterfaceEnv,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the current test suite design of the components in each device changes from initial design:

Local Machine:
- chip-tool snap
- OTBR snap with otbr-agent
- all-clusters snap

Remote Machine:
- OTBR snap with Thread network

to

Local Machine:
- chip-tool snap
-OTBR snap with Thread network

Remote Machine:
- OTBR snap with otbr-agent
- all-clusters snap

, this has been updated to remoteInfraInterface.

"sudo snap connect openthread-border-router:bluetooth-control",
"sudo snap connect openthread-border-router:bluez",
"sudo snap start openthread-border-router",
"sleep 10",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The waiting doesn't require a command executed remotely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wait here has been replaced by:

utils.WaitForLogMessage(t, otbrSnap, "Thread Network", start)

as the Thread network formation changed from the remote device to the local device.

}

// getActiveDataset retrieves the active dataset of Thread network from the remote device
func getActiveDataset(client *ssh.Client, t *testing.T) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func getActiveDataset(client *ssh.Client, t *testing.T) (string, error) {
func getActiveDataset(t *testing.T, client *ssh.Client) (string, error) {

For consistency with other function signatures, such as those from testify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reminder, this was an early draft, but thank you for providing detailed feedback anyway. It has been updated now.


// Install OTBR and form Thread network
if err := installOTBR(client, t); err != nil {
t.Fatalf("failed to setup remote Thread network: %s", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printed error messages should be written as a sentence and start with a capital letter. Same below.

Comment on lines 28 to 30
// Remote device: Access remote device, setup Thread network, and get active dataset
activeDataset := setupRemoteThreadNetwork(t)
trimmedActiveDataset := strings.TrimSpace(activeDataset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The active dataset needed to commission should be from a network formed on the local device. The OTBR agent on the remote device is just enabling communication between the matter app and RCP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial design:

Local Machine:
- chip-tool snap
- OTBR snap with otbr-agent
- all-clusters snap

Remote Machine:
- OTBR snap with Thread network

current design:

Local Machine:
- chip-tool snap
- OTBR snap with Thread network

Remote Machine:
- OTBR snap with otbr-agent
- all-clusters snap

Comment on lines 138 to 141
"sudo openthread-border-router.ot-ctl dataset init new",
"sudo openthread-border-router.ot-ctl dataset commit active",
"sudo openthread-border-router.ot-ctl ifconfig up",
"sudo openthread-border-router.ot-ctl thread start",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There aren't required on the remote side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have been removed from remote side now, as the design for this test suite changes, please see #48 (comment).

localInfraInterfaceEnv = "LOCAL_INFRA_IF"
)

func TestAllClustersAppThread(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should not be run on GH actions, as it always fails due to the missing remote machine and RCPs dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test has been moved to a standalone directory "manual_thread_tests".

@MonicaisHer
Copy link
Contributor Author

MonicaisHer commented Mar 19, 2024

Bluetooth related issue:

CHIP:BLE: New device scanned: 00:1A:7D:DA:71:15
CHIP:BLE: Device discriminator match. Attempting to connect.
CHIP:BLE: ChipDeviceScanner has stopped scanning!
CHIP:DL: HandlePlatformSpecificBLEEvent 16386
CHIP:DIS: Closing all BLE connections
CHIP:TOO: Pairing Failure: src/platform/Linux/BLEManagerImpl.cpp:83: CHIP Error 0x00000032: Timeout

@farshidtz
Copy link
Member

Thanks for the tests.

I could run this after several modifications. Going to merge this into a feature branch to work on fixes, including:

  • Fix remote query for logs (wrong input, timezone mismatch)
  • Merge Go modules, a second module is overkill
  • Commission with code, without using BLE

@farshidtz farshidtz changed the base branch from main to thread-testing April 22, 2024 13:10
@farshidtz farshidtz merged commit c210cb1 into canonical:thread-testing Apr 22, 2024
3 checks passed
@farshidtz farshidtz mentioned this pull request Apr 29, 2024
locnnil pushed a commit that referenced this pull request May 2, 2024
* Add manual local Thread commission test (#48)

* Remove obsolete LFS file, update file names

- Commission via code, without using BLE
- Use UTC time on target to ensure consistent queries regardless of local/remote timezones

Inline with doc updates: canonical/matter-docs#22

---------

Co-authored-by: Mengyi Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants