Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_BattMonitor: Handle allocating too many analog channels #28910

Merged

Conversation

WickedShell
Copy link
Contributor

If you over allocate the number of analog channels this results in a crash. It's easy to trigger this if you have voltage only monitors as we still eat up a current input channel, regarless of if we use it. There are only 16 channels at this time on ChibiOS, so if you have 9 voltage only battery monitors you are out.

This PR improves that situation by only allocating channels when needed, and in the case where we run out we now set a ConfigError, which on a flight controller is much more friendly then a instant segfault the moment we read a battery monitor. NOTE: on AP_Periph this takes the node off the bus, rather then just sitting in the bootloader. This was consideted acceptable as the current behaviour was to segfault and then sit in the bootloader, unless you made new firmware that limited the number of channels allocated it wasn't possible to recover in this situation anyways.

Tested on CubeOrangePlus with plane firmware, as well as on AP_Periph.

@WickedShell WickedShell force-pushed the wickedshell/analog-channel-management branch from 1ae7397 to 4ee5dd5 Compare December 19, 2024 20:53
@WickedShell WickedShell force-pushed the wickedshell/analog-channel-management branch from 4ee5dd5 to 946a25f Compare December 19, 2024 20:53
Copy link
Collaborator

@andyp1per andyp1per left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@peterbarker
Copy link
Contributor

How do you recover the periph if it never comes up on the bus?

@WickedShell
Copy link
Contributor Author

How do you recover the periph if it never comes up on the bus?

Sorry, I should have been more clear. The initial bootloader does come up, and you can grab it then. The previous behavior was that it would sit in the bootloader, declaring a watchdog error. So the recovery path is still uploading custom firmware that reduces the number of analog channels in use. Long term getting ConfigError to work on AP_Periph would be a good enhancement.

Copy link
Member

@IamPete1 IamPete1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line needs deleting from AP_BattMonitor_Synthetic_Currentwhich inherits from AP_BattMonitor_Analog

// this copes with changing the pin at runtime
_state.healthy &= _curr_pin_analog_source->set_pin(_curr_pin);

Looks good apart from that.

If you over allocate the number of analog channels this results in a
crash. It's easy to trigger this if you have voltage only monitors as we
still eat up a current input channel, regarless of if we use it. There
are only 16 channels at this time on ChibiOS, so if you have 9 voltage
only battery monitors you are out.

This PR improves that situation by only allocating channels when needed,
and in the case where we run out we now set a ConfigError, which on a
flight controller is much more friendly then a instant segfault the
moment we read a battery monitor. NOTE: on AP_Periph this takes the
node off the bus, rather then just sitting in the bootloader. This was
consideted acceptable as the current behaviour was to segfault and then
sit in the bootloader, unless you made new firmware that limited the
number of channels allocated it wasn't possible to recover in this
situation anyways.
@WickedShell WickedShell force-pushed the wickedshell/analog-channel-management branch from 946a25f to 3404333 Compare December 23, 2024 18:20
@WickedShell
Copy link
Contributor Author

This line needs deleting from AP_BattMonitor_Synthetic_Currentwhich inherits from AP_BattMonitor_Analog

// this copes with changing the pin at runtime
_state.healthy &= _curr_pin_analog_source->set_pin(_curr_pin);

Looks good apart from that.

Done. I'm very surprised that was even in there, just nuked it out. (It was technically a nullptr crash before as well).

Copy link
Member

@IamPete1 IamPete1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rmackay9 rmackay9 merged commit 5b96189 into ArduPilot:master Dec 24, 2024
99 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants