Missing CPU vulnerability and mitigation info on the Bcm2712 #3726

Open
ell1e opened this issue Jun 15, 2024 · 35 comments

Comments

@ell1e

ell1e commented Jun 15, 2024

My apologies if I'm just missing it, but it seems like this page lacks info on CPU vulnerabilities and the existence and effectiveness of mitigations of these in the latest mainline kernel: https://github.com/raspberrypi/documentation/blob/develop/documentation/asciidoc/computers/processors/bcm2712.adoc

This page suggests at least the predecessor was affected by quite a few of the widely found speculative issues: https://www.cvedetails.com/vulnerability-list/vendor_id-5420/product_id-96497/Broadcom-Bcm2711.html

Having definite information on this is to my understanding essential for e.g. any serious ARM64 cloud data center use and may sometimes even impact web browsing safety whenever it affects process isolation, so it would be nice if this was documented properly somewhere. Also, if there are any helpful workarounds to mitigate potential issues further, like disabling hyper threading does on many x64 desktop CPUs, that would also be useful.

@ghollingworth
Contributor

Rather than us providing generic information concerning particular vulnerabilities, it would be better to refer to the Arm Security Center for the particular architecture in question. For the A76, I believe this is the right page:

https://developer.arm.com/Arm%20Security%20Center/Speculative%20Processor%20Vulnerability

We would only list vulnerabilities which only affect our particular processor.

Happy to add a link to that page somewhere a little more general.

@ell1e
Author

ell1e commented Jun 17, 2024

The ARM page seems geared at system developers, not so much users and administrators. Let me elaborate:

IMHO the most interesting thing to know is whether a mainline Linux kernel would be expected to have the necessary kernel side software mitigations and/or microcode updates (if the CPU supports them) and/or ARM trustzone firmware updates applied out of the box. Especially the latter two might be rather Raspberry Pi 5 specific questions since to my limited understanding, the boot images and detailed support can vary a lot between different ARM64 boards even with the same CPU on it. My apologies if I'm mistaken here.

One way to provide some info on that would be what the Vulnerabilities table of lscpu would be expected to list for a mainline kernel, for example. A note on whether any average user space application dealing with some secret info like passwords, but not running any untrusted code inside its process, would need to make use of any extra compiler mitigations to retain basic process isolation might also help.
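To make the "Vulnerabilities table of lscpu" idea concrete, here is a minimal sketch that pulls just that table out of saved `lscpu` text. It assumes the layout of newer util-linux versions (entries indented under a `Vulnerabilities:` heading) and also accepts the older one-line `Vulnerability <name>:` form. The name `parse_vulnerabilities` and the sample text are made up for illustration.

```python
import re

# Illustrative excerpt in the newer lscpu layout (not a full dump).
SAMPLE = """\
Architecture:             aarch64
Vulnerabilities:
  Meltdown:               Not affected
  Spectre v1:             Mitigation; __user pointer sanitization
  Spectre v2:             Mitigation; CSV2, BHB
"""


def parse_vulnerabilities(lscpu_text: str) -> dict[str, str]:
    """Extract {vulnerability name: status} from plain `lscpu` output."""
    result: dict[str, str] = {}
    in_section = False
    for line in lscpu_text.splitlines():
        # Older layout: "Vulnerability Meltdown:   Not affected"
        old = re.match(r"\s*Vulnerability\s+(.+?):\s*(.*)$", line)
        if old:
            result[old.group(1)] = old.group(2).strip()
            continue
        # Newer layout: a "Vulnerabilities:" heading with indented entries.
        if line.strip() == "Vulnerabilities:":
            in_section = True
            continue
        if in_section:
            if line.startswith((" ", "\t")) and ":" in line:
                name, status = line.split(":", 1)
                result[name.strip()] = status.strip()
            else:
                in_section = False  # first non-indented line ends the section
    return result


if __name__ == "__main__":
    for name, status in parse_vulnerabilities(SAMPLE).items():
        print(f"{name}: {status}")
```

On a real system the input would come from running `lscpu` and capturing its output. The parser only relies on the heading and indentation, so it should tolerate new entries being added by later kernels.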

The ARM page specifically seems to lack the following info, if I'm reading it right:

1. For many mitigations it only states that kernel mitigations are needed, and only for some of them whether mainline Linux actually has them.
2. For some mitigations, TrustZone firmware updates are listed as being needed, but this doesn't tell a user or admin whether action on their part is required or whether the kernel or similar is handling this already.
3. Some mention of a "Variant 4" bug lists specific revisions of the A76, and it's not obvious to me which one a Raspberry Pi 5 ships with.
4. Generally, a table featuring all ARM processors and all their issues, rather than just the trimmed-down list actually relevant for the Raspberry Pi 5, will be hard to read for many admins.

I hope some of this feedback is helpful for figuring out how to best address this.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ell1e
Author

ell1e commented Aug 19, 2024

The underlying issue still seems to be present with the documentation page.

@JamesH65
Contributor

@tdewey-rpi Possible CRA impact?

@tdewey-rpi
Contributor

As @ghollingworth points out, if we replicate information from the ARM Security Center, we do risk stale information being presented to our users.

That puts us in a bit of a bind - on the one hand, I'd love to present a page like @ell1e describes (specifically, Raspberry Pi HW, known hardware vulns, known mitigations), but I'd also be extremely reluctant to risk presenting stale information and granting people the illusion of coverage because they read a stale page in our documentation.

At the very minimum, however, we could expand the hardware descriptions to include the relevant core revision information. We already make reference to the stepping information from Broadcom on the SoCs we use, so I don't see a major technical readability jump by then including core revision data useful for making security decisions. This would not be subject to third party change in a way that would result in us presenting stale data, and would at least make navigating the ARM table more straightforward.
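Until such core revision data is in the docs, the revision can be read off a running board. The sketch below assumes the arm64 `/proc/cpuinfo` format, where `CPU variant` carries the major revision (rN) and `CPU revision` the minor one (pM), following the MIDR_EL1 layout. The helper name `core_revisions` is hypothetical, and the sample values are illustrative only, not a statement about what a BCM2712 reports.

```python
# Illustrative /proc/cpuinfo block; 0xd0b is the Cortex-A76 part number,
# the variant/revision values here are made up for the example.
SAMPLE_CPUINFO = """\
processor\t: 0
CPU implementer\t: 0x41
CPU architecture: 8
CPU variant\t: 0x4
CPU part\t: 0xd0b
CPU revision\t: 1
"""


def core_revisions(cpuinfo_text: str) -> set[str]:
    """Return strings such as 'part 0xd0b r4p1' for each core type found."""
    found: set[str] = set()
    fields: dict[str, str] = {}
    # A trailing blank line is appended so the last core's block is flushed too.
    for line in cpuinfo_text.splitlines() + [""]:
        if line.strip():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
            continue
        if {"CPU part", "CPU variant", "CPU revision"} <= fields.keys():
            major = int(fields["CPU variant"], 16)  # MIDR Variant  -> rN
            minor = int(fields["CPU revision"], 0)  # MIDR Revision -> pM
            found.add(f"part {fields['CPU part']} r{major}p{minor}")
        fields = {}
    return found


if __name__ == "__main__":
    print(core_revisions(SAMPLE_CPUINFO))
```

On a real board one would pass the contents of `/proc/cpuinfo` instead of the sample. The resulting rNpM string is what the revision columns in the Arm table refer to.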

On the CRA, @JamesH65, I'll have to dig in further between the mixes of shoulds and musts - I believe the labelling and linking to the manufacturer's security information would comply, but I still cannot claim complete knowledge at this time.

@ell1e
Author

ell1e commented Aug 20, 2024

Maybe you could then ask ARM to at least add the missing information on which mitigation is actually handled by a mainline kernel in practice? My apologies if I'm just bad at reading, but it seems to be missing in some spots:

Variant 1: Says "Userspace code implementing software privilege boundaries should be reworked" for Linux, but doesn't specify whether the main Linux compiler toolchains gcc or clang were patched upstream to do this and starting from which version.

Variant 2: Says "For Cortex-A8, Cortex-A9, Cortex-A12, Cortex-A15, and Cortex-A17 please apply the kernel patches provided by Arm" for Linux, but doesn't seem to say whether the mainline kernel was patched to do this and starting from which version.

Variant 3: Says "Ensure your kernel implements Kernel Page Table Isolation (KPTI), referring to the patches in the branch above" for Linux, but doesn't seem to say whether the mainline kernel was patched to do this and starting from which version.

Variant 4: Says "Mitigation is based on disabling a hardware feature" for Linux, but doesn't seem to say whether the mainline kernel was patched to do this and starting from which version.

Spectre BHB: Says "View available Kernel patches" for Linux, but doesn't seem to say whether the mainline kernel was patched to do this and starting from which version.

Trusted Firmware-A: This one is listed with multiple patches in multiple places, but seemingly without info whether 1. the user can patch this on Linux and how, or 2. whether the mainline kernel patches this automatically and starting from which version, 3. where to get these updates, etc.

Basically, the info seems at best actionable to kernel developers but mostly useless to end users and system administrators right now. That seems however like what the page https://github.com/raspberrypi/documentation/blob/develop/documentation/asciidoc/computers/processors/bcm2712.adoc should possibly cover, so that end users get a very simple basic criterion like "there are issues x, y, and z, and use kernel x.y.z or newer and it should have all known mitigations for issues x, y, and z". Basically, I think any security-interested end user will just want to know if they need to do anything and if their CPU will be patched in practice or not, so they can make any upgrade or purchase decisions based on that.

Again, sorry if the ARM pages cover all this and I'm just missing the link or section that lists this concisely.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ell1e
Author

ell1e commented Oct 20, 2024

The issue seems to be unsolved.

@ell1e
Author

ell1e commented Oct 20, 2024

Is the stale marker bot perhaps somehow not working right? Sorry if it is and I'm just using up everybody's time.

@by

by commented Nov 8, 2024

Why not simply put the respective links to https://developer.arm.com and https://developer.arm.com/Arm%20Security%20Center into the docs?

@ell1e
Author

ell1e commented Nov 8, 2024

It would be a start, but as I pointed out above, ARM doesn't seem to list that much concrete info on the practical situation on Linux for admins (rather than kernel devs). Unless I missed something, which admittedly wouldn't be the first time 😛

@ell1e
Author

ell1e commented Dec 9, 2024

Could somebody perhaps paste the lscpu output of a recent 6.11 or 6.12 kernel on a Raspberry 5? Unlike all the pages discussed above, the kernel seems to list known vulnerabilities with concrete confirmation whether they're mitigated with the current, actual software stack running in practice as of today. (Not just whether in theory this could be mitigated, which is what ARM seems to list.) This is the information I am still hoping will be added to the documentation at some point.

@JamesH65
Contributor

JamesH65 commented Dec 9, 2024

Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; __user pointer sanitization
Spectre v2: Mitigation; CSV2, BHB
Srbds: Not affected
Tsx async abort: Not affected
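As I read the kernel documentation, the "Spec store bypass" line refers to the per-process control exposed through `prctl(2)`, so a process can ask for its own state. Below is a rough sketch of such a query using `ctypes`. The constants are taken from `<linux/prctl.h>`; the helper names `decode_spec_ctrl` and `query_store_bypass` are made up, and the call assumes a Linux system with a libc that exports `prctl`.

```python
import ctypes

# Constants from <linux/prctl.h>.
PR_GET_SPECULATION_CTRL = 52
PR_SPEC_STORE_BYPASS = 0
SPEC_FLAGS = {
    1 << 0: "PR_SPEC_PRCTL (per-task control available)",
    1 << 1: "PR_SPEC_ENABLE (speculation enabled, mitigation off)",
    1 << 2: "PR_SPEC_DISABLE (speculation disabled, mitigation on)",
    1 << 3: "PR_SPEC_FORCE_DISABLE",
    1 << 4: "PR_SPEC_DISABLE_NOEXEC",
}


def decode_spec_ctrl(value: int) -> list[str]:
    """Turn the bitmask returned by PR_GET_SPECULATION_CTRL into flag names."""
    if value == 0:
        return ["PR_SPEC_NOT_AFFECTED"]
    return [name for bit, name in SPEC_FLAGS.items() if value & bit]


def query_store_bypass() -> int:
    """Ask the kernel for this process's Speculative Store Bypass state."""
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)  # the unused arguments must be zero
    ret = libc.prctl(
        PR_GET_SPECULATION_CTRL,
        ctypes.c_ulong(PR_SPEC_STORE_BYPASS),
        zero, zero, zero,
    )
    if ret < 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_SPECULATION_CTRL) failed")
    return ret
```

Calling `decode_spec_ctrl(query_store_bypass())` on the board in question would show whether the current process runs with the mitigation on or off. The result depends on the kernel and its boot parameters, so no expected output is given here.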

@ell1e
Author

ell1e commented Dec 9, 2024

Thanks so much for the lscpu output, that's lovely and really helpful!! ❤️

> to risk presenting stale information

I thought about this some more, couldn't this be rectified by simply putting a clear publishing date and disclaimer that the info might be outdated? The links to ARM and such could remain in place for potentially newer info.

@nathan-contino
Collaborator

@JamesH65 Should I add something to the docs that suggests consulting lscpu for this information? For anyone who owns a Raspberry Pi, that seems like the best way to answer this question.

@tdewey-rpi
Contributor

> @JamesH65 Should I add something to the docs that suggests consulting lscpu for this information? For anyone who owns a Raspberry Pi, that seems like the best way to answer this question.

That's probably a good start - but you should also make clear that this output may change as updates are delivered and new vulnerabilities discovered. A note on periodically checking would be appropriate.
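A periodic check could be as simple as comparing a saved report with a fresh one. The sketch below assumes each report has already been turned into a name-to-status dictionary; `diff_reports` is a hypothetical helper, and the "Example issue" entry is invented purely to show a change.

```python
# Hypothetical snapshots: "Example issue" is a made-up entry for illustration.
OLD = {"Meltdown": "Not affected", "Spectre v2": "Mitigation; CSV2, BHB"}
NEW = {**OLD, "Example issue": "Vulnerable"}


def diff_reports(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Describe entries that appeared, disappeared, or changed status."""
    changes = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue
        if before is None:
            changes.append(f"new entry: {name}: {after}")
        elif after is None:
            changes.append(f"entry gone: {name} (was: {before})")
        else:
            changes.append(f"changed: {name}: {before} -> {after}")
    return changes


if __name__ == "__main__":
    for change in diff_reports(OLD, NEW):
        print(change)
```

Running something like this after each kernel or firmware update would flag both newly reported vulnerabilities and mitigations that changed.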

@ell1e
Author

ell1e commented Dec 10, 2024

Could you perhaps add the lscpu output to the documentation? I don't really buy devices when I know they have known unpatched CPU vulnerabilities, since often vendors don't care to ever address it. Perhaps that's not just me.

@lurch
Contributor

lurch commented Dec 10, 2024

> Could you perhaps add the lscpu output to the documentation?

Tom's reply implies that this lscpu output will be changing regularly, and I guess it'll also vary by model of Raspberry Pi and by Linux kernel version. We simply don't have the resources to try and keep such a "moving target" up to date in our documentation.

EDIT: And to clarify, I suspect this is one of those situations where out-of-date information may be more harmful than no information at all.

@ell1e
Author

ell1e commented Dec 10, 2024

I don't think it would do any harm if you put a clear date next to the info and updated it every few months or so. Usually, you would check something like that after a new vulnerability was announced, and if the vulnerability is newer than the dated lscpu output or the kernel version used, the output obviously isn't going to show anything relevant.

Perhaps updating the output regularly could help? You could find people willing to update the output for you. I'd be interested in doing so for a 5, I might get one soon.

@by

by commented Dec 10, 2024

If you de-synchronize information by copying and displaying it elsewhere, you take (at least implicit) responsibility for its correctness etc.; I'd recommend avoiding that and sticking with linking etc.

@lurch
Contributor

lurch commented Dec 10, 2024

> I don't really buy devices when I know they have known unpatched CPU vulnerabilities, since often vendors don't care to ever address it. Perhaps that's not just me.

Unlike many other SBC vendors (and apologies for blowing our own trumpet 🎺 ), Raspberry Pi is actually quite good at keeping up to date with mainline kernels. Our Raspberry Pi OS is currently on 6.6.62 from rpi-6.6.y and we're already starting to test rpi-6.12.y with the next option of rpi-update.

@ell1e
Author

ell1e commented Dec 10, 2024

> Raspberry Pi is actually quite good at keeping up to date

Some CPU issues need microcode patches that never materialize. E.g. Intel Skylake seems to be unfixable due to this. That's why it's so interesting to have the real, shipped mitigations info somewhere.

> I'd recommend avoiding that and stick with linking etc.

I actually agree. However, there seems to be no complete source on the web to link here, hence why I think having the lscpu output of a recent kernel somewhere would be useful.

@by

by commented Dec 10, 2024

Then what about suggesting consulting the respective lscpu output of one's local configuration?

@ell1e
Author

ell1e commented Dec 10, 2024

The problem with that is it doesn't work for those who want to check in advance whether a device has known unaddressed vulnerabilities before obtaining it.

@by

by commented Dec 10, 2024

But isn't the whole issue more relevant for professional users buying in batches? – So they could start with one sample.
Anyway, any best practices from other manufacturers?

@ell1e
Author

ell1e commented Dec 10, 2024

I assumed other end users also prefer to get a cpu patched device.

@JamesH65
Contributor

I think telling users that lscpu will give them the current threats and mitigations is appropriate. Effectively, lscpu IS the documentation. Putting a date on a copy of the docs does not work, because as soon as you copy the list to our docs, it could be out of date. Putting the date on it makes no difference to that.

@ell1e
Author

ell1e commented Dec 10, 2024

But if you don't put it into the documentation, it seems to me like it will be unavailable to the majority of interested buyers. Alas, I'm repeating myself.

@JamesH65
Contributor

But as soon as you put it in the docs, it's out of date and useless/inaccurate anyway. Since you are, I think, the only person who has asked for this, I suspect it's not a huge issue. Can I ask what your reasons are for knowing this up front? Is there some company or government policy somewhere that requires this information?

@ell1e
Author

ell1e commented Dec 11, 2024

It's simply that I've bought more than one device that didn't ship with up-to-date microcode under Linux, with no easy way to fix it. Awareness of this generally seems to be rising.

I don't think it's useless with a clear date and disclaimer and updated every few months.

@JamesH65
Contributor

But it won't get updated. Unfortunately, we simply don't have the mechanisms in place or indeed the time to be constantly updating stuff like this. I believe there should only ever be one ultimate source of truth, and in this case that is lscpu. Anything else is effectively outdated as soon as it appears, because not knowing if it is accurate is the same as outdated - unreliable and misleading.

From the Pi point of view, using the latest RPiOS and keeping updated always gets the latest firmware and kernel, which is why we always recommend using the latest versions. You can then use lscpu to check whether any specific vulnerability is present.

@dp111
Contributor

dp111 commented Dec 11, 2024 via email

@lurch
Contributor

lurch commented Dec 11, 2024

> From Pi point of view, using the latest RPiOS and keeping updated always gets the latest firmware and kernel, which is why we always recommend using the latest versions. You can then use lscpu to check whether any specific vulnerability is present.

And this also means that if you're using a 3rd-party OS rather than the official Raspberry Pi OS, that other OS might be using an older version of the Linux kernel, which might be missing the latest CPU vulnerability mitigations. Again, the only way to be sure is to try running lscpu on the relevant OS yourself.
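Where the installed `lscpu` is too old to print the table, the same data can usually be read straight from the kernel's sysfs directory `/sys/devices/system/cpu/vulnerabilities`, which is what `lscpu` reports from. This is a minimal sketch with hypothetical helper names. Note that the kernel writes `Mitigation: ...` with a colon, whereas `lscpu` prints it with a semicolon.

```python
from pathlib import Path

SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(directory: Path = SYSFS_DIR) -> dict[str, str]:
    """Map each file name in the sysfs directory to its one-line status."""
    if not directory.is_dir():
        return {}  # older kernel or non-Linux system: nothing reported
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(directory.iterdir())
        if entry.is_file()
    }


def unmitigated(report: dict[str, str]) -> list[str]:
    """Names whose status starts with 'Vulnerable'."""
    return [name for name, status in report.items() if status.startswith("Vulnerable")]


if __name__ == "__main__":
    report = read_vulnerabilities()
    for name, status in report.items():
        print(f"{name}: {status}")
    print("unmitigated:", unmitigated(report))
```

The directory is a parameter mainly so the function can be pointed at a copy of the files taken from another machine or OS image.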

@ell1e
Author

ell1e commented Dec 11, 2024

I understand the concerns with the resources to keep it updated, but I don't understand claims it's not useful.

E.g., regarding the older kernel remark: a listing from a newer kernel will still show whether vendor patches, be they microcode or Linux contributions, are available in practice at all. Sometimes they are not, and then this info can be a godsend. Even when it's a few months behind, it can be a useful starting point to go check afterward whether an unpatched problem has had some upcoming vendor fix announced or not.
