-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#4547: Refactor vmm builder code to simplify logic that creates the microVM to boot #4910
base: main
Are you sure you want to change the base?
Conversation
38e9537
to
6b97c07
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tommady, thx for doing the PR. Overall changes seems solid, but I left some small notes here and there. Also can you change names of the commits to refactor(builder): ...
. This is usually how we treat refactor commits.
Hi @ShadowCurse I’ve addressed your comments, with one note regarding the set_stdout_nonblocking function. Please review and provide your guidance when you have a moment. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now you need to make sure the changes you made pass our CI. You can look at the buildkite/firecracker-pr
line at the bottom of the PR and click details
to look at the CI status. Start with making sure changes do compile. I can see that on x86_86 only errors are absence of doc comments for public functions. For aarch64, well, it does not even compile. If you don't have access to arm system, you can try to use cross to cross compile.
After everything compiles, please move all changes to corresponding commits.
Also please rebase the PR so it only includes your commits.
Hi @ShadowCurse Thank you for your patience and thorough review—it has been an incredible learning experience for me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that it at least compiles, you need to make sure it also does function as expected. If you look at the Buildkite pipelines output: https://buildkite.com/firecracker/firecracker-pr/builds/11856#0193640b-6e7c-4f49-bbcf-cb60c1c946ef/62-392 you will see that there are some unit tests that fail in the same place src/vmm/src/builder.rs:519:51
. Quick look around and it seems the issue is that when pio_device_manager
registers itself, it registers interrupts for a VM, but the interrupt controller is not yet created at that point. So you need to either move creation of interrupt_controller before pio_device_manager
creation, or move this .register_device(..)
call.
You can reproduce all of this errors if you run CI locally. You can find exact commands CI is running looking at the very beginning of the logs like this: https://buildkite.com/firecracker/firecracker-pr/builds/11856#0193640b-6e7c-4f49-bbcf-cb60c1c946ef/62-63
Start by running unit tests. There are a couple of ways to do this:
./tools/devtool test -- integration_tests/build/test_unittests.py
- this will use the docker container to compile and run tests- Compile test with
cargo test -p vmm --no-run
and then run withsudo ./build/cargo_target/debug/deps/vmm-....
(there will be a full path at the end ofcargo test
) With this method some network related unit tests will fail most likely, but you can skip those.
Also you can make sure your changes are sound by to simply trying to launch a VM with them. We have a doc for this: https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md (in the section about getting a rootfs you can skip the ssh key creation if you don't want to use/test network)
Hi @ShadowCurse, ./tools/devtool test -- integration_tests/build/test_unittests.py
...
integration_tests/build/test_unittests.py::test_unittests PASSED [ 50%]
integration_tests/build/test_unittests.py::test_benchmarks_compile PASSED [100%] All tests passed successfully! 🎉 When you have a moment, please review it again. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4910 +/- ##
==========================================
+ Coverage 84.07% 84.17% +0.09%
==========================================
Files 251 251
Lines 28059 28165 +106
==========================================
+ Hits 23592 23708 +116
+ Misses 4467 4457 -10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
With x86_64 version done, now you need to make sure aarhc64 also compiles/works. If you look at the BK run again: https://buildkite.com/firecracker/firecracker-pr/builds/11886#01936ce9-9b52-4459-ac0a-f8cfbe7a3690/62-624, https://buildkite.com/firecracker/firecracker-pr/builds/11886#01936ce9-9b62-4713-b1eb-33926eacc8b0/62-882 you will see some compilation errors in unit tests and CI test for serial port in a restored VM failing. As I mentioned before, you can use cross to cross compile FC to aarch64. It might even allow some subset of unit tests to be run locally. Or you can use qemu to create an emulated aarch64 VM and test your code there. |
Oh, I see… I assumed the devtool test would handle all platform testing in one sweep. Let me double-check again. |
hmmm so still https://buildkite.com/firecracker/firecracker-pr/builds/11939#019386b4-8ab4-4c7e-9fd4-1507dbddbafb/63-882 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left couple of comments and as I think, the reason for CI failures. After addressing everything please rebase PR on top of main so only your commits are in here.
Alright, CI functional and performance tests are green. The last thing is to make commits look better. As I mentioned before you need to rebase PR on top of main branch so only your commits are visible. This should be as simple as updating you fork of Firecracker and running |
We bumped the snapshot version up twice recently, requiring users to regenerate their snapshot, but the user action isn't clearly stated. Signed-off-by: Takahiro Itazuri <[email protected]> refactor(builder): Refactor vmm builder code to simplify logic eliminate the unnecessary usage of the event_manager argument and fix up aarch64 attach_legacy_devices_aarch64 fn Signed-off-by: tommady <[email protected]> refactor(builder): address comments remove the aarch64 suffix from the attach_legacy_devices_aarch64 function and ensure that aarch64 smt is always set to false int the configure_system_for_boot function Signed-off-by: tommady <[email protected]>
`TcpIPv4Handler` for MMDS network stack preallocates several buffers whose sizes are saved into a snapshot as `max_connections` and `max_pending_resets` in `MmdsNetworkStackState`. But they are always the same constant hardcoded values (`DEFAULT_MAX_CONNECTIONS` and `DEFAULT_MAX_PENDING_RESETS`) as of today, which means there is no need to save them into a snapshot. Even if we change the hardcoded sizes across Firecracker versions, that should not be a problem. This is because the snapshot feature does not support migration of network connections and those buffers are initialized with empty on snapshot restoration. When migrating from a Firecracker version with larger buffers to another version with smaller ones, guest workloads that worked previously might start to fail due to the less buffer spaces. However, the issue is not a problem of the snapshot feature and it should also occur even on a purely booted microVM (not restored from a snapshot). Thus, it is fine to remove those fields from a snapshot. Since this is a breaking change of the snapshot format, bumps the major version. Signed-off-by: Takahiro Itazuri <[email protected]>
We bumped the snapshot version up twice recently, requiring users to regenerate their snapshot, but the user action isn't clearly stated. Signed-off-by: Takahiro Itazuri <[email protected]> refactor(builder): address comments let the x86_64 and aarch64 architectures code can compile and without warnning Signed-off-by: tommady <[email protected]>
make the test all passed Signed-off-by: tommady <[email protected]>
make the aarch64 test all passed Signed-off-by: tommady <[email protected]>
refactor(builder): address comments address comments Signed-off-by: tommady <[email protected]>
There is some information in the Microvm class that we don't save in the snapshot. Some tests do depend on those, so to make the booted/restored case homogenous, make room in the snapshot to save metadata that we can then restore. Signed-off-by: Pablo Barbáchano <[email protected]>
Some further simplifications: - Simplify the "_host" tests as they will provide the same result in both A/B situations. - Add a second MicrovmFactory fixture with the "A" firecracker. - Add a uvm_any fixture that includes all CPU templates and also a boot/restored dimension. This decreases the need for separate tests. For example the guest test test_check_vulnerability_files_ab can now test all variants: 2 (restored/booted) * 3 (kernels) * 9 (cpu templates) = 54 tests Signed-off-by: Pablo Barbáchano <[email protected]>
Use a new uvm_any_booted fixture to make the test simpler. Signed-off-by: Pablo Barbáchano <[email protected]>
Use the new uvm_any fixture to make the tests simpler Signed-off-by: Pablo Barbáchano <[email protected]>
All tests in the file are x86_64 specific, so do the skip at the top-level, once. Signed-off-by: Pablo Barbáchano <[email protected]>
Make the common feature flags to get some clarity on what are the differences between the CPUs. Signed-off-by: Pablo Barbáchano <[email protected]>
This is the same test spread over two files. Putting it together in a new file so they don't drift over time. Signed-off-by: Pablo Barbáchano <[email protected]>
This is a fairly common repeated pattern, so extract it into a method to make writing tests simpler. Signed-off-by: Pablo Barbáchano <[email protected]>
Rewrite test_rng using uvm_any. Signed-off-by: Pablo Barbáchano <[email protected]>
This fixture is not used anymore, as it is mostly superseded by cpu_template_any. Signed-off-by: Pablo Barbáchano <[email protected]>
Since we only have one rootfs, simplify the fixture to return a single value. This will have also not show it as part of the test instance name. Signed-off-by: Pablo Barbáchano <[email protected]>
Add a parameter so that we can control vcpu number and memory. Right now it uses a hardcoded value, but it can be overridden per test. Signed-off-by: Pablo Barbáchano <[email protected]>
aca5101
to
d807ab3
Compare
…boot Streamline the event_manager to avoid passing it through unnecessary functions, isolating the code with distinct architectural designs. Consolidate architecture-specific configurations exclusively within the build_microvm_for_boot and build_microvm_from_snapshot functions for clarity and efficiency. Signed-off-by: tommady <[email protected]>
move interrupt controller into create_vmm_and_vcpus and fix aarm64 integration test Signed-off-by: tommady <[email protected]>
fix(4547): Refactor vmm builder code to simplify logic
Changes
Refactoring Vmm builder code
Reason
close issue #4547
...
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
tools/devtool checkstyle
to verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md
.Runbook for Firecracker API changes.
integration tests.
TODO
.rust-vmm
.