Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-4829 arm32 build #2790

Closed
wants to merge 114 commits into from
Closed

RSDK-4829 arm32 build #2790

wants to merge 114 commits into from

Conversation

abe-winter
Copy link
Member

@abe-winter abe-winter commented Aug 16, 2023

what

  • POC build of viam-server targeting 32-bit arm
  • 32 bit poses two challenges in go: 1) sync/atomic panics, 2) logic errors caused by int32 overflows

high-level changes

This is a huge PR and if reviewers want, I can split it up:

  • go:build tags to toggle off tflite + pigpio
  • new CI tooling:
    • etc/vendorlint: runs linter on entire vendor tree
    • etc/splitter: run test suite in parallel (for speed and to reduce blast radius of flakes)
  • fixes to codebase:
    • struct member order fixes to pass sync/atomic linting
    • several fixes to get test suite passing on 32-bit
  • new CI workflow for 32-bit build + test

overview of sync/atomic linting

  • golang ships the atomicalign analyzer which is available through govet and golangci-lint, but only is active when targeting 32-bit (meaning you are either building natively on a 32-bit system, or targeting 32-bit with e.g. GOARCH=arm)
  • atomicalign catches some but not all cases that would crash on 32-bit architecture

@CLAassistant
Copy link

CLAassistant commented Aug 16, 2023

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ abe-winter
❌ Abe Winter


Abe Winter seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Aug 16, 2023
@abe-winter abe-winter marked this pull request as ready for review September 6, 2023 20:16
@dgottlieb
Copy link
Member

Whow this is a huge accomplishment getting this into PR, well done!

A couple broad comments relating to a large PR and one that touches the deployment lifecycle:

  • Github is notoriously bad at large PRs -- auto-hiding file contents and such (particularly the new files as they are a "large" diff -- certainly nothing I'd care to search through!). So when I'm searching for things like "where are all the places this new thing is used", I'm prone to miss something. Apologies in advance if I miss some references.
  • Because this (conditionally) changes the build process and adds a new github workflow + CI tooling, a summary of how everything is intended to work in the new world in this PR + jira ticket will be greatly appreciated. Is there even a ticket associated with this?
  • Some questions that I think I know the answer to, but stating the answers plainly would help:
    • Are any of the big existing make targets subtly changing in behavior? E.g: server*, test*, lint*.
    • What new jobs are introduced to PRs and pushes to main. What's the consequence of them failing.

new CI tooling:

Are we running vendor lint on every PR and push to main now? Do we intend to release 32-bit builds on the same cadence as everything else now? @Otterverse is there any concern about increased resource consumption? I know we've had issues recently with buildjet machines slowing down due to load causing false positive test failures.

etc/vendorlint: runs linter on entire vendor tree

For posterity, was there something specific to arm32 builds that these helped with? I assume linting will grab some overflow-y and alignment type bugs? Did we only find problems in dependencies in our repositories or did we find some in other dependencies that would be more of a risk/cost to addressing expediently if a new one pops up?

And does this run in CI and will block patches? What's one to do/whom should one reach out to if a "necessary" library upgrade fails linting? Will this run only when the go.mod file is changed?

etc/splitter: run test suite in parallel (for speed and to reduce blast radius of flakes)

Similarly to the load concerns, how broadly is this being adopted and if this is causing too much workload is turning it off safe or will we be seeing "too slow" failures creeping up elsewhere?

I'll leave some more targeted comments in the diff.

@abe-winter abe-winter changed the title arm32 build RSDK-4829 arm32 build Sep 7, 2023
@@ -0,0 +1,4 @@
bin/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these things generated by our make commands that a user/github action is expected to have created now? Or were these artifacts of your personal workflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this was already blocked in our .gitignore -- I added to .dockerignore to make my docker builds start faster

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are in theory nice-to-haves for anyone doing local docker iterations, but are not critical and I can remove if you want

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, though I'm rather unfamiliar. I'll defer to @Otterverse if there's any concerns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see .direnv anywhere in .gitignore, so not sure where that came from. Ignoring .git is good though, saves some code size for most builds. There's other stuff in .gitignore that could be here too... especially node_modules and .artifact I think.

Also, why ignore **/Dockerfile ?

.github/workflows/server-32bit.yml Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@@ -30,9 +30,6 @@ func init() {
// Encoder keeps track of a motor position using a rotary incremental encoder.
type Encoder struct {
resource.Named

positionType encoder.PositionType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity:

I'm guessing the problem here is that sync.Mutex needs to be on some byte-boundary (4? 8?). And this is to shuffle the ordering such that we satisfy that requirement.

I think this is going to be a common sharp edge for rdk developers. It's a reasonable practice to place mutexes near the variables they are intending to protect and (in Go at least) not worry about runtime constraints for correctness.

So in lieu of having some "gofmt-like" program that can rewrite our structures to be 32-bit mutex compliant, we need to start leaving more documentation around our sync.Mutex variables about why they are placed where they are. Maybe in the future we'll fall into a convention that can avoid some of the documentation.

A concrete ask:

  • Leave a comment here saying: `"sync" and "atomic" primitives must be 4-byte aligned for 32-bit builds."
  • Move activeBackgroundWorkers below this mutex. Mostly so we can write one comment that applies to all the important parts. And so we start leaving good examples that people can crib off of.

I'll leave a comment on the other locations changed in this PR so you don't have to hope you caught all of the ones I'm seeing. But please do not audit the entire code-base to make these changes. I just want to get the ball rolling on communicating this subtle requirement to the broader set of developers at viam.

Offline discussion:
The problem isn't with mutexes but rather atomic accesses to the position, pRaw and pState variables. E.g: here. Consider just changing them to atomic.Int64

@@ -173,6 +173,7 @@ func newEncodedMotor(

// EncodedMotor is a motor that utilizes an encoder to track its position.
type EncodedMotor struct {
rpmMonitorCalls int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offline discussion: consider changing this to atomic.Int64.

etc/splitter.py Show resolved Hide resolved
etc/vendorlint.py Show resolved Hide resolved
robot/impl/local_robot_tflite_test.go Show resolved Hide resolved
@@ -7,8 +7,8 @@ import "sync/atomic"
// used for statistical purposes only due to the use of
// atomics and not mutexes.
type RollingAverage struct {
data []int64
pos int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also consider making these atomic.Int64 and []atomic.Int64

etc/splitter.py Show resolved Hide resolved
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 7, 2023
@abe-winter
Copy link
Member Author

re etc/splitter concerns:

  • if you assume that CI has zero startup cost, buildjet has infinite parallelism, and our tests never flake, this has the same resource footprint as current testing
  • (all 3 assumptions are of course false)
  • nonzero startup cost means we can't parallelize infinitely. I think we have 20-30 second startups, so with 5 test shards taking 3-12 minutes each, we're wasting ~5%
  • finite resources at buildjet: I'm assuming they're more interested in 'minutes used at peak' than parallelism; we're only using ~5% more minutes (see setup wastage above). we're also using more IO by doing 1 docker pull per shard
  • yes we can remove this if it causes problems. I think arm32 tests are slower than 64-bit, but not like 2x slower

the benefits of splitter:

  • less clock time developers have to wait for a build
  • if one of the 5 shards fails due to a flake, we can rerun just that shard rather than the entire suite

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 7, 2023
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 7, 2023
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 7, 2023
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 7, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 7, 2023

Code Coverage

Code Coverage
Package Line Rate Delta Health
go.viam.com/rdk/cli 11% 0.00%
go.viam.com/rdk/components/arm 58% 0.00%
go.viam.com/rdk/components/arm/fake 27% 0.00%
go.viam.com/rdk/components/arm/universalrobots 10% 0.00%
go.viam.com/rdk/components/arm/wrapper 19% 0.00%
go.viam.com/rdk/components/arm/xarm 22% 0.00%
go.viam.com/rdk/components/arm/yahboom 6% 0.00%
go.viam.com/rdk/components/audioinput 43% -0.47%
go.viam.com/rdk/components/base 60% 0.00%
go.viam.com/rdk/components/base/agilex 60% 0.00%
go.viam.com/rdk/components/base/kinematicbase 45% 0.00%
go.viam.com/rdk/components/base/sensorbase 68% 0.00%
go.viam.com/rdk/components/base/wheeled 89% 0.00%
go.viam.com/rdk/components/board 59% 0.00%
go.viam.com/rdk/components/board/customlinux 50% 0.00%
go.viam.com/rdk/components/board/fake 38% 0.00%
go.viam.com/rdk/components/board/genericlinux 8% 0.00%
go.viam.com/rdk/components/board/numato 19% 0.00%
go.viam.com/rdk/components/board/pi/impl 11% 0.00%
go.viam.com/rdk/components/camera 57% 0.00%
go.viam.com/rdk/components/camera/align 58% 0.00%
go.viam.com/rdk/components/camera/fake 74% 0.00%
go.viam.com/rdk/components/camera/ffmpeg 82% 0.00%
go.viam.com/rdk/components/camera/replaypcd 90% 0.00%
go.viam.com/rdk/components/camera/rtsp 52% 0.00%
go.viam.com/rdk/components/camera/transformpipeline 73% 0.00%
go.viam.com/rdk/components/camera/ultrasonic 61% 0.00%
go.viam.com/rdk/components/camera/videosource 39% 0.00%
go.viam.com/rdk/components/encoder 56% 0.00%
go.viam.com/rdk/components/encoder/ams 63% 0.00%
go.viam.com/rdk/components/encoder/fake 83% 0.00%
go.viam.com/rdk/components/encoder/incremental 80% 0.00%
go.viam.com/rdk/components/encoder/single 86% 0.00%
go.viam.com/rdk/components/gantry 58% 0.00%
go.viam.com/rdk/components/gantry/multiaxis 82% 0.00%
go.viam.com/rdk/components/gantry/singleaxis 81% 0.00%
go.viam.com/rdk/components/generic 79% 0.00%
go.viam.com/rdk/components/gripper 56% 0.00%
go.viam.com/rdk/components/input 88% 0.00%
go.viam.com/rdk/components/input/fake 93% 0.00%
go.viam.com/rdk/components/input/gpio 85% 0.00%
go.viam.com/rdk/components/motor 70% 0.00%
go.viam.com/rdk/components/motor/dimensionengineering 67% 0.00%
go.viam.com/rdk/components/motor/dmc4000 70% 0.00%
go.viam.com/rdk/components/motor/fake 55% 0.00%
go.viam.com/rdk/components/motor/gpio 73% 0.00%
go.viam.com/rdk/components/motor/gpiostepper 72% 0.00%
go.viam.com/rdk/components/motor/tmcstepper 53% 0.00%
go.viam.com/rdk/components/motor/ulnstepper 53% 0.00%
go.viam.com/rdk/components/movementsensor 74% 0.00%
go.viam.com/rdk/components/movementsensor/adxl345 75% 0.00%
go.viam.com/rdk/components/movementsensor/gpsnmea 55% 0.00%
go.viam.com/rdk/components/movementsensor/gpsrtkpmtk 21% 0.00%
go.viam.com/rdk/components/movementsensor/gpsrtkserial 28% 0.00%
go.viam.com/rdk/components/movementsensor/merged 91% 0.00%
go.viam.com/rdk/components/movementsensor/mpu6050 84% 0.00%
go.viam.com/rdk/components/movementsensor/replay 93% 0.00%
go.viam.com/rdk/components/movementsensor/rtkutils 24% 0.00%
go.viam.com/rdk/components/movementsensor/wheeledodometry 78% +0.70%
go.viam.com/rdk/components/posetracker 71% 0.00%
go.viam.com/rdk/components/powersensor 48% 0.00%
go.viam.com/rdk/components/sensor 49% 0.00%
go.viam.com/rdk/components/sensor/ultrasonic 38% 0.00%
go.viam.com/rdk/components/servo 60% 0.00%
go.viam.com/rdk/components/servo/gpio 72% 0.00%
go.viam.com/rdk/config 80% 0.00%
go.viam.com/rdk/control 57% 0.00%
go.viam.com/rdk/data 73% 0.00%
go.viam.com/rdk/examples/customresources/demos/remoteserver 0% 0.00%
go.viam.com/rdk/grpc 25% 0.00%
go.viam.com/rdk/internal/cloud 100% 0.00%
go.viam.com/rdk/ml 67% 0.00%
go.viam.com/rdk/ml/inference 67% 0.00%
go.viam.com/rdk/module 75% 0.00%
go.viam.com/rdk/module/modmanager 82% 0.00%
go.viam.com/rdk/motionplan 78% -0.42%
go.viam.com/rdk/motionplan/ik 68% 0.00%
go.viam.com/rdk/motionplan/tpspace 52% 0.00%
go.viam.com/rdk/operation 83% 0.00%
go.viam.com/rdk/pointcloud 67% 0.00%
go.viam.com/rdk/protoutils 49% 0.00%
go.viam.com/rdk/referenceframe 69% 0.00%
go.viam.com/rdk/resource 77% 0.00%
go.viam.com/rdk/rimage 55% 0.00%
go.viam.com/rdk/rimage/depthadapter 94% 0.00%
go.viam.com/rdk/rimage/transform 70% 0.00%
go.viam.com/rdk/rimage/transform/cmd/extrinsic_calibration 67% 0.00%
go.viam.com/rdk/robot 85% 0.00%
go.viam.com/rdk/robot/client 82% 0.00%
go.viam.com/rdk/robot/framesystem 36% 0.00%
go.viam.com/rdk/robot/impl 83% 0.00%
go.viam.com/rdk/robot/packages 75% 0.00%
go.viam.com/rdk/robot/server 55% 0.00%
go.viam.com/rdk/robot/web 65% 0.00%
go.viam.com/rdk/robot/web/stream 87% 0.00%
go.viam.com/rdk/services/baseremotecontrol 50% 0.00%
go.viam.com/rdk/services/baseremotecontrol/builtin 81% 0.00%
go.viam.com/rdk/services/datamanager 65% 0.00%
go.viam.com/rdk/services/datamanager/builtin 90% +0.49%
go.viam.com/rdk/services/datamanager/datacapture 73% 0.00%
go.viam.com/rdk/services/datamanager/datasync 0% 0.00%
go.viam.com/rdk/services/mlmodel 76% 0.00%
go.viam.com/rdk/services/mlmodel/tflitecpu 85% 0.00%
go.viam.com/rdk/services/motion 52% 0.00%
go.viam.com/rdk/services/motion/builtin 83% 0.00%
go.viam.com/rdk/services/navigation 47% 0.00%
go.viam.com/rdk/services/navigation/builtin 87% 0.00%
go.viam.com/rdk/services/sensors 81% 0.00%
go.viam.com/rdk/services/sensors/builtin 97% 0.00%
go.viam.com/rdk/services/shell 11% 0.00%
go.viam.com/rdk/services/slam 73% 0.00%
go.viam.com/rdk/services/slam/fake 82% 0.00%
go.viam.com/rdk/services/vision 35% 0.00%
go.viam.com/rdk/services/vision/colordetector 56% 0.00%
go.viam.com/rdk/services/vision/detectionstosegments 67% 0.00%
go.viam.com/rdk/services/vision/mlvision 65% +0.54%
go.viam.com/rdk/services/vision/obstaclesdepth 78% 0.00%
go.viam.com/rdk/services/vision/obstaclesdistance 77% 0.00%
go.viam.com/rdk/services/vision/obstaclespointcloud 59% 0.00%
go.viam.com/rdk/session 94% 0.00%
go.viam.com/rdk/spatialmath 82% 0.00%
go.viam.com/rdk/utils 65% 0.00%
go.viam.com/rdk/utils/contextutils 38% 0.00%
go.viam.com/rdk/vision 37% 0.00%
go.viam.com/rdk/vision/delaunay 87% 0.00%
go.viam.com/rdk/vision/keypoints 92% 0.00%
go.viam.com/rdk/vision/objectdetection 69% 0.00%
go.viam.com/rdk/vision/segmentation 54% 0.00%
go.viam.com/rdk/web/server 26% 0.00%
Summary 61% (23174 / 37948) -0.01%

Copy link
Member

@Otterverse Otterverse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should've probably been four different PRs: The splitter testing tool, the vendorlint tool, the 32-bit workflow changes, and then the actual 32-bit fixes. I'm not saying they need to be split up at this point, but I will kind of treat them as different "reviews" in my responses below. I've put a lot of inline notes as well, but I'm going to write a larger overview here that may repeat some of them.

Also, my apologies for how long this is. Lots to address, and a lot of my critiques are things that probably need discussion, so keep in mind I'm wording this as what I think is best, not just saying "do it my way!"

Also-also… some of the inline comments may be out of date. You pushed several updates while I was in the process of reviewing, so some could be obsolete.

32-bit fixes

So first off, great work hunting down all the 64/32 bit and alignment issues. As Dan noted, I'm sure that was a pain to figure out. All that part of this PR looks great to me!

Splitter

The splitter code I have a few questions about. First off, it wasn't really clear to me the purpose, since go test already runs stuff in parallel where possible, and if it was a matter of horsepower, we have access to up to 32-core runners. I'm presuming that it's more about being able to retry failures more quickly? I'm also curious if you looked into existing tools for this, such as https://github.com/hashicorp-forge/go-test-split-action and if so, why they weren't viable.

I also have a slight concern about splitting tests onto different machines actually changing the behavior. We've caught quite a few bugs because tests WERE on the same machine, and processes had file or socket races against each other. At the far extreme (if every test got its own job) that'd mean we never catch issues like that. A handful of chunks doesn't seem too bad if it saves us on retries though.

Lastly I'm not a huge fan of a go-specific testing utility being written in python instead of Go itself (or very basic bash.) It means we now have to always involve two languages in all our build/test environments. It also means anyone debugging issues or looking to improve this now needs to have python expertise in addition to go expertise. This is a pretty complicated utility, and I actually struggled to understand parts of it, as my python skills are not as advanced as yours. I I'm not not saying (necessarily) that these can't remain python, but it is a discussion we should have, and likely with other people involved. The python code itself looks well thought out though, and the idea of speeding things up this way is great.

Vendorlint

First, I think the normal 32-bit lint should be split into a separate makefile target from the vendorlint run. I'm hoping vendorlint only needs to be run rarely, and not every lint check. I don't mind having an extra target there for manual runs/checks when needed, but automating it (with a full go mod vendor run) every lint check feels like a big resource/time hog. I understand why this was needed to initially hunt down the 32-bit issues, but can you clarify why and when this is expected to be needed going forward?

Secondly, by doing "go mod vendor" this changes how all future work in someone's codebase functions, which can disrupt IDEs that use LSP as they were using modules in ~/go/ for reference, so they may need reloads or point to the wrong files. Also, and more importantly, vendoring breaks builds. Once vendored, the pion/mediadevices module can no longer find the opus.h headers, and thus viam-server can no longer compile. Likewise, make lint-go fails due to issues compiling tools.

I definitely think this needs to be a target that only gets run manually when we need to investigate a 32-bit issue. It's too intensive to run in every PR update (I just timed it on your arm image and it took 20 minutes), and on developer machines the vendoring is disruptive. The latter can be overlooked if it's not part of the normal workflow though, just document that a "clean-all" is needed to get back to normal working state.

The same notes I had about splitter being written in python apply here as well.

Workflows/Docker

The actual build process stuff needs a rework I believe, and should follow the pattern used by all our other workflows (the ones that work using the canon docker images.)

The big thing is that the lint, test, and (server) build steps need to be well separated from the creation of the docker image itself, and should be integrated into the existing workflows.

For the docker image build itself... There should be a process to build a docker image that contains a canon-compatible toolchain/env. (The canon utility doesn't have 32 bit arm support yet, but I can add that next week pretty easily. And it's not needed for workflow/CI use.) I know we talked briefly the other day about bespoke being fine for now, but seeing how simple this dockerfile currently is (and how similar the end result is to canon), it might actually be worth it to just go ahead and add this properly as a canon image variant. (e.g. arm instead of arm64) Look at https://github.com/viamrobotics/SysOps/tree/main/docker and you can probably just add a couple build targets to the Makefile there, and tweak the canon/Dockerfile with any arm32 specific conditionals you need. Builds for this shouldn't need to be made/pushed any more frequently than canon itself, so no real need to automate I think (though I may do that anyway next month when updating canon like I talked about.) I think this is additionally a good thing to do now, as the current image being built isn't actually 32-bit.

For the workflow(s) (that are USING the built docker images), I'd really like to see them integrated into existing workflows (e.g. being called by pull-request-trusted.yml, main.yml, or sub-workflows of those) where needed. We definitely don't want this all to just run on every push like it's currently set to do. I'd also like to see them use the workflow native "container" options rather than manually constructing docker run commands. Lastly, everything should be using make targets as much as possible, rather than putting complex commands in the workflow files themselves. It's MUCH easier to debug/update/modify/test these later in the Makefile than in the workflow. It should be pretty easy to automate the arm32 detection and enabled the couple extra build flags you've added.

@@ -85,7 +85,7 @@ require (
go.uber.org/zap v1.24.0
go.viam.com/api v0.1.186
go.viam.com/test v1.1.1-0.20220913152726-5da9916c08a2
go.viam.com/utils v0.1.43
go.viam.com/utils v0.1.44-0.20230905200840-300b46e8f195
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to use a directly tagged/real version number (without the date/time stamp) when this actually publishes. (I know the 1.44 tag doesn't exist yet.) It makes things easier to parse for some workflow automation (we try to sync these versions to match in app as well) and easier for humans to look up/compare changes.

Comment on lines +1 to +2
on:
push:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to be building a new docker image every single push. This needs to be a called workflow, hooked into the proper "top-level" workflows (e.g. pull-request-trusted.yml and/or main.yml)

- uses: docker/build-push-action@v4
with:
file: etc/packaging/32bit/Dockerfile
tags: ghcr.io/viamrobotics/viam-server:armv7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the docker image probably shouldn't be the same as our binary, but it's fine for the moment.

Comment on lines +32 to +36
- uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe you need this when using the image (only when publishing.) The docker images should be public.

.github/workflows/server-32bit.yml Show resolved Hide resolved
Makefile Outdated
Comment on lines 63 to 65
go mod vendor
# parallelism is 1 for now because golangci-lint detects parallel runs and complains
etc/vendorlint.py all --linter $(abspath $(TOOL_BIN)/golangci-lint) --parallel 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vendorlint ALWAYS necessary? Or can it be it's own "vendorlint" target that we only run manually/occasionally? In either case, I think this should be split into two seperate targets, so people working locally can run just the "quick" lint when they're changing their own files (and not modifying vendor stuff.)

registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: docker/build-push-action@v4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to be building a 32-bit image. The image currently at ghcr.io/viamrobotics/viam-server:armv7 is actually 64bit arm when I inspect it as well.

$ docker image inspect ghcr.io/viamrobotics/viam-server:armv7 | grep Arch
        "Architecture": "arm64",

uses: google-github-actions/[email protected]
with:
headers: "cache-control: no-cache"
path: viam-server-latest-armv7l
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is a distributable binary. There's a whole process to do a (mostly) statically linked binary and/or appimages. This appears to just be raw binary, and with the current docker image, a 64-bit one at that.

@@ -0,0 +1,4 @@
bin/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see .direnv anywhere in .gitignore, so not sure where that came from. Ignoring .git is good though, saves some code size for most builds. There's other stuff in .gitignore that could be here too... especially node_modules and .artifact I think.

Also, why ignore **/Dockerfile ?

$(TOOL_BIN)/golangci-lint run -v --build-tags no_tflite,no_pigpio --tests=false --disable-all --enable staticcheck --timeout 30m

vendorlint: tool-install
go mod vendor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an expensive (bandwidth/time) operation. It also leaves the vendor directory permanently which breaks future builds for regular use. To be specific, make server throws errors about missing opus.h after I run this target.

@abe-winter
Copy link
Member Author

refactoring to split out fixes from CI / container changes

@abe-winter abe-winter closed this Sep 11, 2023
@abe-winter abe-winter deleted the arm32-build branch September 11, 2023 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants